{"id":27814,"date":"2021-03-10T10:03:23","date_gmt":"2021-03-10T09:03:23","guid":{"rendered":"https:\/\/www.inovex.de\/blog\/?p=20870"},"modified":"2022-09-26T08:50:43","modified_gmt":"2022-09-26T06:50:43","slug":"data-processing-scaled-up-and-out-with-dask-and-rapids-running-scaled-data-science-workloads","status":"publish","type":"post","link":"https:\/\/www.inovex.de\/de\/blog\/data-processing-scaled-up-and-out-with-dask-and-rapids-running-scaled-data-science-workloads\/","title":{"rendered":"Data processing scaled up and out with Dask and RAPIDS (3\/3)"},"content":{"rendered":"<p>This blog post tutorial shows how a scalable and high-performance environment for machine learning can be set up using the ingredients GPUs, Kubernetes clusters, Dask and Jupyter.\u00a0\u00a0In the preceding posts of our series, we have set up a GPU-enabled Kubernetes platform on GCP and deployed Jupyterhub as an interactive development environment for data scientists. Furthermore, we prepared a notebook image that has Dask and Dask-Rapids installed. Now it is time to actually do some coding and compare the results. In this final article, we will compare the efficiency of four approaches for a typical machine learning task: a random forest. We will implement it in Sklearn, which uses only one machine (and 2 cores), then we will parallelize the Sklearn code with Dask, and execute it on up to 4 machines (each with 2 cores). Finally, we will use the GPUs: a single one with Rapids and multiple with Dask-Rapids.<!--more--><\/p>\n<p>We can access JupyterHub on port 8000 from the browser, log in (if authentication is enabled) and we can see the workspace of our JupyterLab instance.<\/p>\n<p>For evaluation, we will take a look at the Dask-Rapids example, load a dataset and fit a Random Forest to it. 
After that, we will compare the performance between Sklearn (single-Node CPUs), Dask ML (multi-Node CPUs), cuML (single GPU) and Dask-cuML (multi GPU).\u00a0\u00a0About the dataset: We will use a real case dataset from the Santander Bank (customer-transaction-prediction) which is a 300MB .csv file.<\/p>\n<div id=\"ez-toc-container\" class=\"ez-toc-v2_0_83 counter-hierarchy ez-toc-counter ez-toc-custom ez-toc-container-direction\">\n<div class=\"ez-toc-title-container\"><p class=\"ez-toc-title\" style=\"cursor:inherit\"><\/p>\n<\/div><nav><ul class='ez-toc-list ez-toc-list-level-1 ' ><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-1\" href=\"https:\/\/www.inovex.de\/de\/blog\/data-processing-scaled-up-and-out-with-dask-and-rapids-running-scaled-data-science-workloads\/#Sklearn\" >Sklearn<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-2\" href=\"https:\/\/www.inovex.de\/de\/blog\/data-processing-scaled-up-and-out-with-dask-and-rapids-running-scaled-data-science-workloads\/#Dask\" >Dask<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-3\" href=\"https:\/\/www.inovex.de\/de\/blog\/data-processing-scaled-up-and-out-with-dask-and-rapids-running-scaled-data-science-workloads\/#Rapids-%E2%80%93-cuML\" >Rapids &#8211; cuML<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-4\" href=\"https:\/\/www.inovex.de\/de\/blog\/data-processing-scaled-up-and-out-with-dask-and-rapids-running-scaled-data-science-workloads\/#Dask-Rapids-%E2%80%93-cuML\" >Dask-Rapids &#8211; cuML<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-5\" href=\"https:\/\/www.inovex.de\/de\/blog\/data-processing-scaled-up-and-out-with-dask-and-rapids-running-scaled-data-science-workloads\/#Results-and-Experiences\" >Results and Experiences<\/a><\/li><li class='ez-toc-page-1 
ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-6\" href=\"https:\/\/www.inovex.de\/de\/blog\/data-processing-scaled-up-and-out-with-dask-and-rapids-running-scaled-data-science-workloads\/#Troubleshooting\" >Troubleshooting<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-7\" href=\"https:\/\/www.inovex.de\/de\/blog\/data-processing-scaled-up-and-out-with-dask-and-rapids-running-scaled-data-science-workloads\/#Summary\" >Summary<\/a><\/li><\/ul><\/nav><\/div>\n<h2><span class=\"ez-toc-section\" id=\"Sklearn\"><\/span>Sklearn<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>Let&#8217;s start with the Sklearn example. We load the dataset with Pandas, split the dataset into <i>train<\/i> and <i>test<\/i> (80\/20) and specify the parameters for the forest. After that, we can start the fit function, wait until it is done and then make predictions for the verification part. Finally, we can take a look at the score. The code and results:<\/p>\n<pre class=\"lang:python decode:true\">import pandas as pd\r\nfrom sklearn.ensemble import RandomForestClassifier as sklRF\r\nfrom sklearn.model_selection import train_test_split\r\nfrom sklearn.metrics import accuracy_score\r\n\r\n# read from Bucket\r\ncpu_train = pd.read_csv('gs:\/\/dask_rapids\/train.csv')\r\n\r\ncpu_train_x, cpu_test_x, cpu_train_y, cpu_test_y = train_test_split(cpu_train.iloc[:,2:], cpu_train['target'], test_size=0.2, shuffle=True)\r\n\r\nskl_rf_params = {\r\n\u00a0 \u00a0 'n_estimators': 35,\r\n\u00a0 \u00a0 'max_depth': 26,\r\n\u00a0 \u00a0 'n_jobs': 2 }\r\n\r\nskl_rf = sklRF(**skl_rf_params)\r\n\r\n%time skl_rf.fit(cpu_train_x, cpu_train_y)\r\n\r\n%time predictions = skl_rf.predict(cpu_test_x)\r\naccuracy_score(cpu_test_y, predictions)<\/pre>\n<p>An important aspect here is the n_jobs parameter (the number of available cores, in our case: 2) in the specification of our forest. With Sklearn, you can use all the cores of a single machine to speed up computing. But it would be neat to use more than a single machine has to offer \u2013 for example to combine the cores of all the nodes in our cluster. This is where Dask comes into play.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Dask\"><\/span>Dask<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>Dask offers the possibility to easily parallelize the code we used above across the whole cluster. Actually, the only changes we have to apply are to persist the data across the workers and wrap the <i>fit <\/i>function with the <i>joblib.parallel_backend <\/i>function. A specification for the workers needs to be defined, and GCSFS needs to be present on the workers as well, hence it is added under extra pip packages.
This is how it looks in this case:<\/p>\n<pre class=\"lang:yaml decode:true \"># worker-spec.yml\r\n\r\nkind: Pod\r\nmetadata:\r\n\u00a0 labels:\r\n\u00a0 \u00a0 foo: bar\r\nspec:\r\n\u00a0 restartPolicy: Never\r\n\u00a0 containers:\r\n\u00a0 - image: registry.inovex.de:4567\/hsmannheim\/emq-dockerimages\/worker_no_cuda:03\r\n\u00a0 \u00a0 imagePullPolicy: IfNotPresent\r\n\u00a0 \u00a0 args: [dask-worker, --nthreads, '2', --no-bokeh, --memory-limit, 6GB, --death-timeout, '60']\r\n\u00a0 \u00a0 name: dask\r\n\u00a0 \u00a0 env:\r\n\u00a0 \u00a0 \u00a0 - name: EXTRA_PIP_PACKAGES\r\n\u00a0 \u00a0 \u00a0 \u00a0 value: gcsfs\r\n\u00a0 \u00a0 resources:\r\n\u00a0 \u00a0 \u00a0 limits:\r\n\u00a0 \u00a0 \u00a0 \u00a0 cpu: \"2\"\r\n\u00a0 \u00a0 \u00a0 \u00a0 memory: 6G\r\n\u00a0 \u00a0 \u00a0 requests:\r\n\u00a0 \u00a0 \u00a0 \u00a0 cpu: \"2\"\r\n\u00a0 \u00a0 \u00a0 \u00a0 memory: 6G\r\n\u00a0 \u00a0 volumeMounts:\r\n\u00a0 \u00a0 \u00a0 - name: read-bucket-configmap\r\n\u00a0 \u00a0 \u00a0 \u00a0 mountPath: \"\/home\/bucketCredentials\/\"\r\n\u00a0 volumes:\r\n\u00a0 - name: read-bucket-configmap\r\n\u00a0 \u00a0 configMap:\r\n\u00a0 \u00a0 \u00a0 name: readbuckets<\/pre>\n<p>The image parameter defines the registry the image can be pulled from. One could use the official and up-to-date image <i>daskdev\/dask:latest<\/i>. Although the components of our base image with CUDA, Dask and Rapids are updated regularly, it may be a little behind the official one. To avoid inconsistencies between the client (scheduler &#8211; Jupyter) and the workers, we built an image for the workers with pinned versions instead. In the <i>worker-spec.yml <\/i>we can also specify the resources or extra packages that need to be installed when starting the worker pod. An important part is mounting the config map with the credentials to access the bucket by setting the <i>volumes <\/i>and <i>volumeMounts<\/i> parameters.
The Dockerfile for the workers can be seen here:<\/p>\n<pre class=\"lang:default decode:true \">FROM continuumio\/miniconda3:4.7.12\r\n\r\nRUN conda install --yes \\\r\n\u00a0 \u00a0 -c conda-forge \\\r\n\u00a0 \u00a0 python-blosc \\\r\n\u00a0 \u00a0 cytoolz \\\r\n\u00a0 \u00a0 dask==2.15.0 \\\r\n\u00a0 \u00a0 lz4 \\\r\n\u00a0 \u00a0 nomkl \\\r\n\u00a0 \u00a0 numpy==1.18.1 \\\r\n\u00a0 \u00a0 pandas==0.25.3 \\\r\n\u00a0 \u00a0 tini==0.18.0 \\\r\n\u00a0 \u00a0 zstd==1.4.3 \\\r\n\u00a0 \u00a0 distributed==2.15.2\\\r\n\u00a0 \u00a0 python==3.7.6\\\r\n\u00a0 \u00a0 scikit-learn \\\r\n\u00a0 \u00a0 &amp;&amp; conda clean -tipsy \\\r\n\u00a0 \u00a0 &amp;&amp; find \/opt\/conda\/ -type f,l -name '*.a' -delete \\\r\n\u00a0 \u00a0 &amp;&amp; find \/opt\/conda\/ -type f,l -name '*.pyc' -delete \\\r\n\u00a0 \u00a0 &amp;&amp; find \/opt\/conda\/ -type f,l -name '*.js.map' -delete \\\r\n\u00a0 \u00a0 &amp;&amp; find \/opt\/conda\/lib\/python*\/site-packages\/bokeh\/server\/static -type f,l -name '*.js' -not -name '*.min.js' -delete \\\r\n\u00a0 \u00a0 &amp;&amp; rm -rf \/opt\/conda\/pkgs\r\n\r\nCOPY prepare.sh \/usr\/bin\/prepare.sh\r\n\r\nRUN mkdir \/opt\/app\r\n\r\nENTRYPOINT [\"tini\", \"-g\", \"--\", \"\/usr\/bin\/prepare.sh\"]<\/pre>\n<p>The <i>prepare.sh <\/i>can be copied from the official dask repository. It needs to reside in the same folder as the Dockerfile for the workers. <i>Prepare.sh<\/i>:<\/p>\n<pre class=\"lang:sh decode:true \">#!\/bin\/bash\r\n\r\nset -x\r\n\r\n# We start by adding extra apt packages, since pip modules may require libraries\r\nif [ \"$EXTRA_APT_PACKAGES\" ]; then\r\n\u00a0 \u00a0 echo \"EXTRA_APT_PACKAGES environment variable found.\u00a0 Installing.\"\r\n\u00a0 \u00a0 apt update -y\r\n\u00a0 \u00a0 apt install -y $EXTRA_APT_PACKAGES\r\nfi\r\n\r\nif [ -e \"\/opt\/app\/environment.yml\" ]; then\r\n\u00a0 \u00a0 echo \"environment.yml found.
Installing packages\"\r\n\u00a0 \u00a0 \/opt\/conda\/bin\/conda env update -f \/opt\/app\/environment.yml\r\nelse\r\n\u00a0 \u00a0 echo \"no environment.yml\"\r\nfi\r\n\r\nif [ \"$EXTRA_CONDA_PACKAGES\" ]; then\r\n\u00a0 \u00a0 echo \"EXTRA_CONDA_PACKAGES environment variable found.\u00a0 Installing.\"\r\n\u00a0 \u00a0 \/opt\/conda\/bin\/conda install -y $EXTRA_CONDA_PACKAGES\r\nfi\r\n\r\nif [ \"$EXTRA_PIP_PACKAGES\" ]; then\r\n\u00a0 \u00a0 echo \"EXTRA_PIP_PACKAGES environment variable found.\u00a0 Installing.\"\r\n\u00a0 \u00a0 \/opt\/conda\/bin\/pip install $EXTRA_PIP_PACKAGES\r\nfi\r\n\r\n# Run extra commands\r\nexec \"$@\"<\/pre>\n<p>Let\u2019s take a look at the code and see how much better we can get. First, we will use only one worker, with 2 cores and 6 GB of RAM. Then 2, 3 and finally 4 workers:<\/p>\n<pre class=\"lang:python decode:true \">from dask_kubernetes import KubeCluster\r\nimport joblib\r\nimport distributed\r\nfrom sklearn.ensemble import RandomForestClassifier as sklRF\r\nimport pandas as pd\r\nfrom sklearn.model_selection import train_test_split\r\nfrom sklearn.metrics import accuracy_score\r\nimport dask.dataframe as dd\r\n\r\ncluster = KubeCluster.from_yaml('worker-spec.yml')\r\n\r\ncluster.scale_up(2) # specify the number of worker nodes explicitly\r\n\r\n# Connect dask to the cluster\r\nclient = distributed.Client(cluster)\r\n\r\ncpu_train = pd.read_csv('gs:\/\/dask_rapids\/train.csv')\r\ncpu_train_x, cpu_test_x, cpu_train_y, cpu_test_y = train_test_split(cpu_train.iloc[:,2:], cpu_train['target'], test_size=0.2, shuffle=True)\r\n\r\nskl_rf_params = {\r\n\u00a0 \u00a0 'n_estimators': 35,\r\n\u00a0 \u00a0 'max_depth': 26,\r\n\u00a0 \u00a0 'n_jobs': 2 }\r\n\r\ndd_train_x = dd.from_pandas(cpu_train_x, npartitions=4)\r\ndd_train_y = dd.from_pandas(cpu_train_y, npartitions=4)\r\n\r\ndd_train_x.persist()\r\ndd_train_y.persist()\r\n\r\nskl_rf = sklRF(**skl_rf_params)\r\n\r\nwith joblib.parallel_backend('dask'):\r\n\u00a0 \u00a0 %time skl_rf.fit(dd_train_x, dd_train_y)\r\n\r\n%time predictions = skl_rf.predict(cpu_test_x)\r\naccuracy_score(cpu_test_y, predictions)<\/pre>\n<p>You can port-forward your Jupyter pod on 8787 and see the Dask Dashboard under <code>http:\/\/localhost:8787<\/code>. The workers, their tasks and resources can be viewed there. In the example below you can see 4 workers with 2 CPUs (threads) each, hence we see 8 task streams:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignright wp-image-20872 size-full\" src=\"https:\/\/www.inovex.de\/wp-content\/uploads\/2021\/02\/image2.png\" alt=\"\" width=\"1999\" height=\"1080\" srcset=\"https:\/\/www.inovex.de\/wp-content\/uploads\/2021\/02\/image2.png 1999w, https:\/\/www.inovex.de\/wp-content\/uploads\/2021\/02\/image2-300x162.png 300w, https:\/\/www.inovex.de\/wp-content\/uploads\/2021\/02\/image2-1024x553.png 1024w, https:\/\/www.inovex.de\/wp-content\/uploads\/2021\/02\/image2-768x415.png 768w, https:\/\/www.inovex.de\/wp-content\/uploads\/2021\/02\/image2-1536x830.png 1536w, https:\/\/www.inovex.de\/wp-content\/uploads\/2021\/02\/image2-1920x1037.png 1920w, https:\/\/www.inovex.de\/wp-content\/uploads\/2021\/02\/image2-400x216.png 400w, https:\/\/www.inovex.de\/wp-content\/uploads\/2021\/02\/image2-360x194.png 360w\" sizes=\"auto, (max-width: 1999px) 100vw, 1999px\" \/><\/p>\n<p>&nbsp;<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Rapids-%E2%80%93-cuML\"><\/span>Rapids &#8211; cuML<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>We have parallelized the Random Forest across the CPUs of our cluster. Now let&#8217;s use some GPUs. We start by focusing on one GPU with the RAPIDS cuML library. Since its API is similar to the one of Sklearn, the code will look similar as well. With cuDF we can read the dataset .csv file directly from the bucket.
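The dashboard port-forward mentioned above can be done with kubectl; the pod name and namespace below are placeholders that depend on your JupyterHub deployment:

```shell
# Forward the Dask dashboard port of the Jupyter user pod to localhost.
# Pod name and namespace are placeholders -- look yours up with `kubectl get pods -A`.
kubectl port-forward pod/jupyter-myuser 8787:8787 -n jhub
```

Afterwards the dashboard is reachable at http://localhost:8787 as long as the command keeps running.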
Splitting the dataset into <i>train<\/i> and <i>test<\/i> looks almost like in Sklearn. An important thing to keep in mind is the datatypes. Although we can train the forest with <i>float64<\/i>, if we want to use GPU-based prediction we should use <i>float32<\/i> for training. Labels should be <em>int32<\/em>.<\/p>\n<pre class=\"lang:python decode:true\">from cuml import RandomForestClassifier as cumlRF\r\nimport cudf\r\nfrom cuml.preprocessing.model_selection import train_test_split\r\nfrom cuml.metrics.accuracy import accuracy_score\r\n\r\nsgdf_train = cudf.read_csv('gs:\/\/dask_rapids\/train.csv')\r\n\r\nsgdf_train_x, sgdf_test_x, sgdf_train_y, sgdf_test_y = train_test_split(\r\n\u00a0 \u00a0 sgdf_train.iloc[:,2:].astype('float32'),\r\n\u00a0 \u00a0 sgdf_train['target'].astype('int32'),\r\n\u00a0 \u00a0 shuffle=True, train_size=0.8)\r\n\r\ncu_rf_params = {\r\n\u00a0 \u00a0 'n_estimators': 35,\r\n\u00a0 \u00a0 'max_depth': 26,\r\n\u00a0 \u00a0 'n_bins': 15,\r\n\u00a0 \u00a0 'n_streams': 8\r\n}\r\n\r\ncuml_rf = cumlRF(**cu_rf_params)\r\n\r\n%time cuml_rf.fit(sgdf_train_x, sgdf_train_y)\r\n\r\n%time predictions = cuml_rf.predict(sgdf_test_x.as_matrix(), predict_model='CPU')\r\n\r\n%time accuracy_score(sgdf_test_y, predictions)<\/pre>\n<h2><span class=\"ez-toc-section\" id=\"Dask-Rapids-%E2%80%93-cuML\"><\/span>Dask-Rapids &#8211; cuML<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>If one GPU is not enough, we can use more of them. With Dask-Rapids there are two possibilities. The first one, which will be shown here, is multi-GPU computing, which combines all the GPUs of a single node.
It is done by creating a <i>LocalCUDACluster<\/i>, which automatically recognizes the GPUs of the node that runs Jupyter (in this case). For this scenario, no Docker image is needed for the workers, which simplifies the configuration. The second possibility is multi-GPU\/multi-node computing, which takes one GPU from every considered node. In this case, similar to the Dask-Sklearn example from above, a specification and an image for every worker is needed.<\/p>\n<p>As mentioned before, we start by creating a <i>LocalCUDACluster(n_workers=n)<\/i>. If we omit the <i>n_workers<\/i> specification, all available GPUs of our node will be taken. Then we connect a client to the cluster. A dashboard is available as well, just like in the Dask-Sklearn example. A little tip for the dashboard: <code>http:\/\/localhost:8787\/individual-gpu-memory<\/code>\u00a0and <code>http:\/\/localhost:8787\/individual-gpu-utilization<\/code> show more detailed information about the actual state of our GPUs.<\/p>\n<p>&nbsp;<\/p>\n<figure id=\"attachment_20874\" aria-describedby=\"caption-attachment-20874\" style=\"width: 1432px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-20874 size-full\" src=\"https:\/\/www.inovex.de\/wp-content\/uploads\/2021\/02\/image8.png\" alt=\"Similar to plain Dask dashboard, Dask-Rapids offers one as well.
Here we can see 3 streams representing 3 available GPUs.\" width=\"1432\" height=\"1190\" srcset=\"https:\/\/www.inovex.de\/wp-content\/uploads\/2021\/02\/image8.png 1432w, https:\/\/www.inovex.de\/wp-content\/uploads\/2021\/02\/image8-300x249.png 300w, https:\/\/www.inovex.de\/wp-content\/uploads\/2021\/02\/image8-1024x851.png 1024w, https:\/\/www.inovex.de\/wp-content\/uploads\/2021\/02\/image8-768x638.png 768w, https:\/\/www.inovex.de\/wp-content\/uploads\/2021\/02\/image8-400x332.png 400w, https:\/\/www.inovex.de\/wp-content\/uploads\/2021\/02\/image8-360x299.png 360w\" sizes=\"auto, (max-width: 1432px) 100vw, 1432px\" \/><figcaption id=\"caption-attachment-20874\" class=\"wp-caption-text\">Similar to plain Dask dashboard, Dask-Rapids offers one as well. Here we can see 3 streams representing 3 available GPUs.<\/figcaption><\/figure>\n<p>Since Dask-cuDF does not offer splitting the dataset into <i>train<\/i> and <i>test<\/i> parts (at least at the time this article was written), we will use the standard cuDF read function, split our dataset, and finally convert it to Dask-cuDF Dataframe &amp; Series. For the convert step, we need to specify how many partitions we want for our data. Choosing a number which corresponds to the number of our GPU workers seems reasonable.<\/p>\n<p>A necessary step is to persist the data across all the workers. Unlike in the Dask-only version, a simple persist is not enough. We need to use the Dask-cuML function <i>persist_across_workers<\/i>. 
This way we make sure that all the workers (GPUs) have access to the data while performing <i>fit<\/i> or <i>predict<\/i>:<\/p>\n<pre class=\"lang:python decode:true \">from dask_cuda import LocalCUDACluster\r\nimport dask_cudf\r\nimport cudf\r\nfrom cuml.dask.ensemble import RandomForestClassifier as cumlDaskRF\r\nfrom cuml.dask.common.utils import persist_across_workers\r\nfrom cuml.preprocessing.model_selection import train_test_split\r\nfrom cuml.metrics.accuracy import accuracy_score\r\nfrom dask.distributed import Client\r\n\r\ncluster = LocalCUDACluster(n_workers=3)\r\nclient = Client(cluster)\r\n\r\nclient\r\n\r\nsgdf_train = cudf.read_csv('gs:\/\/dask_rapids\/train.csv')\r\n\r\nsgdf_train_x, sgdf_test_x, sgdf_train_y, sgdf_test_y = train_test_split(\r\n\u00a0 \u00a0 sgdf_train.iloc[:,2:].astype('float32'),\r\n\u00a0 \u00a0 sgdf_train['target'].astype('int32'),\r\n\u00a0 \u00a0 shuffle=True, train_size=0.8)\r\n\r\nmgdf_train_x = dask_cudf.from_cudf(sgdf_train_x, npartitions=3)\r\nmgdf_train_y = dask_cudf.from_cudf(sgdf_train_y, npartitions=3)\r\nmgdf_test_x = dask_cudf.from_cudf(sgdf_test_x, npartitions=3)\r\nmgdf_test_y = dask_cudf.from_cudf(sgdf_test_y, npartitions=3)\r\n\r\nmgdf_train_x, mgdf_train_y, mgdf_test_x, mgdf_test_y = persist_across_workers(\r\n\u00a0 \u00a0 client, [mgdf_train_x, mgdf_train_y, mgdf_test_x, mgdf_test_y])\r\n\r\ncu_rf_params = {\r\n\u00a0 \u00a0 'n_estimators': 35,\r\n\u00a0 \u00a0 'max_depth': 26,\r\n\u00a0 \u00a0 'n_bins': 15,\r\n\u00a0 \u00a0 'n_streams': 8\r\n}\r\n\r\ncuml_rf = cumlDaskRF(**cu_rf_params)\r\n\r\n%time cuml_rf.fit(mgdf_train_x,
mgdf_train_y)\r\n\r\n%time predictions = cuml_rf.predict(mgdf_test_x, predict_model='GPU').compute()\r\n\r\n%time accuracy_score(mgdf_test_y.compute(), predictions)<\/pre>\n<h2><span class=\"ez-toc-section\" id=\"Results-and-Experiences\"><\/span>Results and Experiences<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>Plain Sklearn took about 1 min 13 sec to fit the forest. Using Dask-only (no GPUs), we can speed things up quite nicely. Using only one worker yields slightly worse results than Sklearn, which is not surprising, since the costs (computational\/administrative overhead) of distributed computing are not to be ignored. However, with every additional worker the computation time decreased, leading to an almost 2.5x speed-up with 4 workers.\u00a0The real game-changers are the GPUs. We can observe a 48x speedup with a single GPU compared to Sklearn and almost a 20x speedup compared to Dask with 4 workers.<\/p>\n<figure id=\"attachment_20875\" aria-describedby=\"caption-attachment-20875\" style=\"width: 1432px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-20875 size-full\" src=\"https:\/\/www.inovex.de\/wp-content\/uploads\/2021\/02\/image8.png\" alt=\"Comparison of the Random-Forest training time for Sklearn, Dask (1-4 workers) and cuML.
\" width=\"1432\" height=\"1190\" srcset=\"https:\/\/www.inovex.de\/wp-content\/uploads\/2021\/02\/image8.png 1432w, https:\/\/www.inovex.de\/wp-content\/uploads\/2021\/02\/image8-300x249.png 300w, https:\/\/www.inovex.de\/wp-content\/uploads\/2021\/02\/image8-1024x851.png 1024w, https:\/\/www.inovex.de\/wp-content\/uploads\/2021\/02\/image8-768x638.png 768w, https:\/\/www.inovex.de\/wp-content\/uploads\/2021\/02\/image8-400x332.png 400w, https:\/\/www.inovex.de\/wp-content\/uploads\/2021\/02\/image8-360x299.png 360w\" sizes=\"auto, (max-width: 1432px) 100vw, 1432px\" \/><figcaption id=\"caption-attachment-20875\" class=\"wp-caption-text\">Comparison of the Random-Forest training time for Sklearn, Dask (1-4 workers) and cuML.<\/figcaption><\/figure>\n<p>We can do even better if we use Dask-Rapids and a few GPUs. The difference here is not that spectacular due to the size of the data. At some point, scaling brings no improvement. Even more, adding a 4th GPU would lead to an increase in computation time compared to 2 or 3 GPUs.<\/p>\n<figure id=\"attachment_20876\" aria-describedby=\"caption-attachment-20876\" style=\"width: 1999px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-20876 size-full\" src=\"https:\/\/www.inovex.de\/wp-content\/uploads\/2021\/02\/image6.png\" alt=\"Comparison of training times for a single GPU and multiple GPUs. 
\" width=\"1999\" height=\"970\" srcset=\"https:\/\/www.inovex.de\/wp-content\/uploads\/2021\/02\/image6.png 1999w, https:\/\/www.inovex.de\/wp-content\/uploads\/2021\/02\/image6-300x146.png 300w, https:\/\/www.inovex.de\/wp-content\/uploads\/2021\/02\/image6-1024x497.png 1024w, https:\/\/www.inovex.de\/wp-content\/uploads\/2021\/02\/image6-768x373.png 768w, https:\/\/www.inovex.de\/wp-content\/uploads\/2021\/02\/image6-1536x745.png 1536w, https:\/\/www.inovex.de\/wp-content\/uploads\/2021\/02\/image6-1920x932.png 1920w, https:\/\/www.inovex.de\/wp-content\/uploads\/2021\/02\/image6-400x194.png 400w, https:\/\/www.inovex.de\/wp-content\/uploads\/2021\/02\/image6-360x175.png 360w\" sizes=\"auto, (max-width: 1999px) 100vw, 1999px\" \/><figcaption id=\"caption-attachment-20876\" class=\"wp-caption-text\">Comparison of training times for a single GPU and multiple GPUs.<\/figcaption><\/figure>\n<p>An interesting aspect is the prediction time. Here Sklearn clearly wins with a prediction time of 324 ms, while single-GPU cuML needs about 897 ms. Prediction time is even worse with distributed GPU computing in Dask-cuML: it ranged from 6.16 s (2 GPUs) to 4.44 s (3 GPUs). The accuracy, however, was nearly the same: about 89.877 % for the GPU-based RF and 89.882 % for Sklearn.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Troubleshooting\"><\/span>Troubleshooting<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>There are a few problems you may encounter while configuring everything. Rapids is generally made for GPUs with a Compute Capability (CC) of 6.0 or higher. So if you want to use, let&#8217;s say, a Tesla K80 with CC 3.7, you will not be able to fully use Rapids\u2019 functions and you will encounter the error <i>no kernel image is available for execution on the device<\/i>. You can avoid that by installing Rapids from source and changing the CC in the CMake file.
However, there is no guarantee that everything will work as it should.<\/p>\n<p>Other problems may occur while building your own images with Rapids and Dask. You have to keep the dependencies in mind. For example, Rapids (0.13) requires a lower version of Pandas than the one installed with Dask. You have to specify and pin the versions explicitly. This helps keep the Dask-worker image consistent with the client image.<\/p>\n<p>While deploying JupyterHub, it may take a long time to pull the image from the repository (about 15-20 minutes in my case). That is because the images are pretty big, having CUDA, Jupyter, Dask and Rapids installed. This may result in a timeout error. You can add a <code>--timeout<\/code> flag with a large value to the deployment command to avoid this.<\/p>\n<p>If you cannot access the buckets from Jupyter, check whether you (Jupyter + workers) have access to the credentials and whether the GOOGLE_APPLICATION_CREDENTIALS variable is set correctly.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Summary\"><\/span>Summary<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>In this blog post series, we have learned a few things. In part 1 we set up a Kubernetes cluster on GCP with accessible GPUs, including installing the package manager Helm (version 2 or 3). After that, in part 2, we prepared the environment to work with: JupyterHub with proper notebook images \u2013 including the CUDA library \u2013 with Dask, Rapids and Dask-Rapids on top of that.<\/p>\n<p>Finally, we took a look at a practical use-case for Dask and Dask-Rapids, presenting a Random Forest implementation using 4 different methods. While Sklearn, using a single machine, is the slowest one, it is easily parallelized with Dask, which allows it to use more than a single machine and, in the case of 4 workers, enables a decent speedup of 2.5x. The real game-changers, however, are the GPUs.
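As a quick check for the credentials issue from the Troubleshooting section, here is a minimal sketch; the helper name is ours, not part of any library:

```python
import os

def gcs_credentials_ok() -> bool:
    """Return True if GOOGLE_APPLICATION_CREDENTIALS is set and points to an existing file."""
    cred_path = os.environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    return cred_path is not None and os.path.isfile(cred_path)
```

Run it in the notebook, and on a running cluster `client.run(gcs_credentials_ok)` executes the same check on every Dask worker, so a missing configmap mount shows up immediately.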
Rapids offers a Pandas-like interface and a single GPU increases the performance dramatically, resulting in a 48x speed-up over Sklearn. As if one GPU is not enough, one can use Dask-Rapids to combine several graphic units! But keep in mind, while training is much faster with GPUs, the prediction time is better on a CPU. So, as always, smart design for best efficiency is required.<\/p>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>This blog post tutorial shows how a scalable and high-performance environment for machine learning can be set up using the ingredients GPUs, Kubernetes clusters, Dask and Jupyter.\u00a0\u00a0In the preceding posts of our series, we have set up a GPU-enabled Kubernetes platform on GCP and deployed Jupyterhub as an interactive development environment for data scientists. Furthermore, [&hellip;]<\/p>\n","protected":false},"author":179,"featured_media":20919,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"inline_featured_image":false,"ep_exclude_from_search":false,"footnotes":""},"tags":[],"service":[431],"coauthors":[{"id":179,"display_name":"Rafal Lokuciejewski","user_nicename":"rafal-lokuciejewskiinovex-de"}],"class_list":["post-27814","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","service-data-science"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.5 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Data processing scaled up and out with Dask and RAPIDS (3\/3) - inovex GmbH<\/title>\n<meta name=\"description\" content=\"This blog post shows how high-performance environment for machine learning can be set up using Dask and Rapids.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" 
href=\"https:\/\/www.inovex.de\/de\/blog\/data-processing-scaled-up-and-out-with-dask-and-rapids-running-scaled-data-science-workloads\/\" \/>\n<meta property=\"og:locale\" content=\"de_DE\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Data processing scaled up and out with Dask and RAPIDS (3\/3) - inovex GmbH\" \/>\n<meta property=\"og:description\" content=\"This blog post shows how high-performance environment for machine learning can be set up using Dask and Rapids.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.inovex.de\/de\/blog\/data-processing-scaled-up-and-out-with-dask-and-rapids-running-scaled-data-science-workloads\/\" \/>\n<meta property=\"og:site_name\" content=\"inovex GmbH\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/inovexde\" \/>\n<meta property=\"article:published_time\" content=\"2021-03-10T09:03:23+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2022-09-26T06:50:43+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.inovex.de\/wp-content\/uploads\/2021\/02\/gpgpu-2.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1920\" \/>\n\t<meta property=\"og:image:height\" content=\"1080\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Rafal Lokuciejewski\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:image\" content=\"https:\/\/www.inovex.de\/wp-content\/uploads\/2021\/02\/gpgpu-2-1024x576.png\" \/>\n<meta name=\"twitter:creator\" content=\"@inovexgmbh\" \/>\n<meta name=\"twitter:site\" content=\"@inovexgmbh\" \/>\n<meta name=\"twitter:label1\" content=\"Verfasst von\" \/>\n\t<meta name=\"twitter:data1\" content=\"Rafal Lokuciejewski\" \/>\n\t<meta name=\"twitter:label2\" content=\"Gesch\u00e4tzte Lesezeit\" \/>\n\t<meta name=\"twitter:data2\" content=\"15\u00a0Minuten\" \/>\n\t<meta name=\"twitter:label3\" content=\"Written 
by\" \/>\n\t<meta name=\"twitter:data3\" content=\"Rafal Lokuciejewski\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/data-processing-scaled-up-and-out-with-dask-and-rapids-running-scaled-data-science-workloads\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/data-processing-scaled-up-and-out-with-dask-and-rapids-running-scaled-data-science-workloads\\\/\"},\"author\":{\"name\":\"Rafal Lokuciejewski\",\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/#\\\/schema\\\/person\\\/4852bd3d70d7e8d5453571bb27fc29c1\"},\"headline\":\"Data processing scaled up and out with Dask and RAPIDS (3\\\/3)\",\"datePublished\":\"2021-03-10T09:03:23+00:00\",\"dateModified\":\"2022-09-26T06:50:43+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/data-processing-scaled-up-and-out-with-dask-and-rapids-running-scaled-data-science-workloads\\\/\"},\"wordCount\":1823,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/#organization\"},\"image\":{\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/data-processing-scaled-up-and-out-with-dask-and-rapids-running-scaled-data-science-workloads\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/www.inovex.de\\\/wp-content\\\/uploads\\\/2021\\\/02\\\/gpgpu-2.png\",\"articleSection\":[\"Analytics\",\"English 
Content\",\"General\"],\"inLanguage\":\"de\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/data-processing-scaled-up-and-out-with-dask-and-rapids-running-scaled-data-science-workloads\\\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/data-processing-scaled-up-and-out-with-dask-and-rapids-running-scaled-data-science-workloads\\\/\",\"url\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/data-processing-scaled-up-and-out-with-dask-and-rapids-running-scaled-data-science-workloads\\\/\",\"name\":\"Data processing scaled up and out with Dask and RAPIDS (3\\\/3) - inovex GmbH\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/data-processing-scaled-up-and-out-with-dask-and-rapids-running-scaled-data-science-workloads\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/data-processing-scaled-up-and-out-with-dask-and-rapids-running-scaled-data-science-workloads\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/www.inovex.de\\\/wp-content\\\/uploads\\\/2021\\\/02\\\/gpgpu-2.png\",\"datePublished\":\"2021-03-10T09:03:23+00:00\",\"dateModified\":\"2022-09-26T06:50:43+00:00\",\"description\":\"This blog post shows how high-performance environment for machine learning can be set up using Dask and 
Rapids.\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/data-processing-scaled-up-and-out-with-dask-and-rapids-running-scaled-data-science-workloads\\\/#breadcrumb\"},\"inLanguage\":\"de\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/data-processing-scaled-up-and-out-with-dask-and-rapids-running-scaled-data-science-workloads\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"de\",\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/data-processing-scaled-up-and-out-with-dask-and-rapids-running-scaled-data-science-workloads\\\/#primaryimage\",\"url\":\"https:\\\/\\\/www.inovex.de\\\/wp-content\\\/uploads\\\/2021\\\/02\\\/gpgpu-2.png\",\"contentUrl\":\"https:\\\/\\\/www.inovex.de\\\/wp-content\\\/uploads\\\/2021\\\/02\\\/gpgpu-2.png\",\"width\":1920,\"height\":1080},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/data-processing-scaled-up-and-out-with-dask-and-rapids-running-scaled-data-science-workloads\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Data processing scaled up and out with Dask and RAPIDS (3\\\/3)\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/#website\",\"url\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/\",\"name\":\"inovex GmbH\",\"description\":\"\",\"publisher\":{\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"de\"},{\"@type\":\"Organization\",\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/#organization\",\"name\":\"inovex 
GmbH\",\"url\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"de\",\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/#\\\/schema\\\/logo\\\/image\\\/\",\"url\":\"https:\\\/\\\/www.inovex.de\\\/wp-content\\\/uploads\\\/2021\\\/03\\\/inovex-logo-16-9-1.png\",\"contentUrl\":\"https:\\\/\\\/www.inovex.de\\\/wp-content\\\/uploads\\\/2021\\\/03\\\/inovex-logo-16-9-1.png\",\"width\":1921,\"height\":1081,\"caption\":\"inovex GmbH\"},\"image\":{\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/#\\\/schema\\\/logo\\\/image\\\/\"},\"sameAs\":[\"https:\\\/\\\/www.facebook.com\\\/inovexde\",\"https:\\\/\\\/x.com\\\/inovexgmbh\",\"https:\\\/\\\/www.instagram.com\\\/inovexlife\\\/\",\"https:\\\/\\\/www.linkedin.com\\\/company\\\/inovex\",\"https:\\\/\\\/www.youtube.com\\\/channel\\\/UC7r66GT14hROB_RQsQBAQUQ\"]},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/#\\\/schema\\\/person\\\/4852bd3d70d7e8d5453571bb27fc29c1\",\"name\":\"Rafal Lokuciejewski\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"de\",\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/8bf762d23ce1a4aca8afafba67dce7d6b0dabbcb56999bbb2e41d56664f9bcb7?s=96&d=retro&r=ge3f981b6ae50c555514691c36d70131a\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/8bf762d23ce1a4aca8afafba67dce7d6b0dabbcb56999bbb2e41d56664f9bcb7?s=96&d=retro&r=g\",\"contentUrl\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/8bf762d23ce1a4aca8afafba67dce7d6b0dabbcb56999bbb2e41d56664f9bcb7?s=96&d=retro&r=g\",\"caption\":\"Rafal Lokuciejewski\"},\"url\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/author\\\/rafal-lokuciejewskiinovex-de\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","_links":{"self":[{"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/posts\/27814","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/users\/179"}],"replies":[{"embeddable":true,"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/comments?post=27814"}],"version-history":[{"count":5,"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/posts\/27814\/revisions"}],"predecessor-version":[{"id":38485,"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/posts\/27814\/revisions\/38485"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/media\/20919"}],"wp:attachment":[{"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/media?parent=27814"}],"wp:term":[{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/tags?post=27814"},{"taxonomy":"service","embeddable":true,"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/service?post=27814"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/coauthors?post=27814"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}