MapR Cluster Administration Training

Training sessions are usually held in German. Please contact us if you are interested in a session in English.

This training provides the knowledge required to develop big data applications with Apache Spark 2.1.

Participants first learn how to use the Spark shell to load data sets from various sources and formats and analyse them interactively. Building on that, they then develop a standalone Spark application that processes data as Datasets and DataFrames, either locally or on a cluster.

The training is rounded off with an introduction to Spark Streaming for processing data streams, GraphFrames for analysing graphs, and the machine-learning library MLlib.

Agenda:

  • Introduction to the MapR Converged Data Platform (HDFS core components, MapR-FS core components, MapR-FS versus HDFS)

  • Installation preparation (security modes, planning of the service layout, preparation of cluster hardware, testing of nodes)

  • Installation of the MapR Converged Data Platform (MapR Installer, performing a manual installation, licensing of the cluster)

  • Verification and testing of the cluster (verification of the cluster status, post-installation benchmark tests, cluster structures)

  • Working with volumes (introduction to volumes, cluster topology, attributes for standard volumes, development of a volume plan, setup and configuration of volumes)

  • Working with snapshots (introduction to snapshots, creating snapshots, use and management of snapshots)

  • Working with mirrors (introduction to mirrors, working with local mirrors, working with remote mirrors, remote mirrors and disaster recovery)

  • Configuration of user and cluster parameters (management of users and groups, access control expressions (ACEs), user and group quotas, configuration of topology and email notifications)

  • Configuration of cluster access (access to data in the cluster, virtual IP addresses for NFS access, client configuration)

  • Cluster monitoring and management (use of MCS and CLI, MapR Monitoring, reacting to alarms)

  • Disk and node maintenance (adding disks, replacing faulty disks, node maintenance, adding nodes)

  • Troubleshooting cluster problems (basic troubleshooting, tools and utilities)

  • Installation and configuration of YARN (YARN services, YARN job execution flow, YARN configuration)

Target audience: Administrators, System Engineers

Duration: 3 days

Would you like a consultation on this on-site training?

Call Collin Rogowski on +49 172 5673497 or send us an email. We look forward to advising you.