Item number: 118508006

Data Analyst to Data Scientist Training

364.04 (433.21 incl. VAT)

Order this unique e-learning course, Data Analyst to Data Scientist, online: one year of round-the-clock access to extensive interactive videos, progress reporting, and tests.

Availability:
In stock
Training offering: ICT training
  • Award-winning e-learning
  • Lowest price guarantee
  • Personal service from our team of experts
  • Pay securely online or by invoice
  • Order and start within 24 hours

Data Analyst to Data Scientist E-Learning Training

Order this unique e-learning course, Data Analyst to Data Scientist, online: one year of round-the-clock access to extensive interactive videos with audio, progress monitoring through reports, and chapter tests to check your knowledge directly.

Data Science Track 1: Data Analyst
Data Science Track 2: Data Wrangler
Data Science Track 3: Data Ops
Data Science Track 4: Data Scientist

Course content

Data Science Track 1: Data Analyst

Data Architecture Primer

Course: 1 Hour, 4 Minutes

  • Course Overview
  • Data Defined
  • Data Privacy
  • The Data Lifecycle
  • SQL vs. NoSQL
  • Create an Entity Relationship Diagram
  • Implement a SQL Solution
  • Implement a NoSQL Solution
  • Big Data
  • Data Architecture and Governance
  • IT Data System Architecture Types
  • Data Analytics and Reporting
  • Exercise: Implement Data Architecture Best Practices

Data Engineering Fundamentals

Course: 46 Minutes

  • Course Overview
  • Overview of Distributed Systems
  • Batch vs. In-Memory Processing
  • NoSQL Stores
  • Tools for Data Management
  • What is ETL?
  • ETL with Talend Open Studio
  • Data Modeling
  • AI and Machine Learning
  • Data Partitioning
  • Data Engineering
  • Data Reporting
  • Exercise: Create a Data Model

Python for Data Science: Introduction to NumPy for Multi-dimensional Data

Course: 1 Hour

  • Course Overview
  • Introduction to NumPy and the NumPy Ecosystem
  • Array Creation - Part 1
  • Array Creation - Part 2
  • Printing Arrays
  • Basic Array Operations
  • Universal Functions
  • Indexing and Slicing
  • Iterating Over Arrays
  • Reshaping Arrays
  • Exercise: Python NumPy Array Operations

Python for Data Science: Advanced Operations with NumPy Arrays

Course: 1 Hour, 8 Minutes

  • Course Overview
  • Splitting NumPy Arrays
  • Images as Arrays
  • Image Manipulation Using NumPy
  • Views and NumPy Arrays
  • Deep Copies of Arrays
  • Introduction to Index Masks
  • Applying Index Masks
  • Indexing with Boolean Masks
  • Structured Arrays
  • Understanding Array Broadcasting
  • Applying Broadcasting Rules on Array Operations
  • Exercise: NumPy Multi-dimensional Array Operations

Python for Data Science: Introduction to Pandas

Course: 1 Hour, 6 Minutes

  • Course Overview
  • Features of Pandas and the Pandas Ecosystem
  • Introduction to Pandas
  • Work with Pandas
  • Introduction to Data Frames
  • Work with Data Frames
  • Load Data into a Data Frame
  • Add and Delete Data Frame Contents
  • Select Parts of a Data Frame
  • Access Pandas Data Frames
  • Introduction to Multi-Indexing in a Data Frame
  • Reshape Data Frames
  • Reshape Data Frames Using Stack and Melt Operations
  • Exercise: Pandas for Basic Tabular Data Manipulation

Python for Data Science: Manipulating and Analyzing Data in Pandas DataFrames

Course: 45 Minutes

  • Course Overview
  • Iterating Over the Contents of a Data Frame
  • Exporting a Data Frame
  • Sorting
  • Handling Missing Data
  • Grouping with a Multi-Index
  • Merging Data Frames
  • Applying Join Operations on Data Frames
  • Pandas and Relational Databases
  • Exercise: Pandas for Advanced Data Manipulation

R for Data Science: Data Structures

Course: 52 Minutes

  • Course Overview
  • Creating Vectors
  • Manipulating Vectors
  • Sorting Vectors
  • Using Lists
  • Creating Matrices
  • Matrix Operations
  • Creating Factors
  • Creating Data Frames
  • Data Frame Operations
  • Exercise: Creating and Using a Data Frame

R for Data Science: Importing and Exporting Data

Course: 34 Minutes

  • Course Overview
  • Reading from CSV
  • Reading from Excel
  • Reading from HTML
  • Exporting to CSV
  • Exporting to Excel
  • Exporting to HTML
  • Exercise: Reading and Writing Data

R for Data Science: Data Exploration

Course: 41 Minutes

  • Course Overview
  • Creating dplyr Tables
  • Selecting Subsets
  • Filtering Tabular Data
  • Piping Data
  • Mutating Data
  • Summarizing Data
  • Combining Datasets
  • Grouping Data
  • Exercise: Querying Data

R for Data Science: Regression Methods

Course: 37 Minutes

  • Course Overview
  • Linear Data Preparation
  • Creating Linear Models
  • Interpreting Model Output
  • Using Linear Prediction
  • Logistic Data Preparation
  • Using glm
  • Exercise: Creating a Linear Model

R for Data Science: Classification & Clustering

Course: 39 Minutes

  • Course Overview
  • Preparing Data for Classification
  • Using rpart
  • Using ctree
  • Preparing Data for Clustering
  • Using K-Means Clustering
  • Using Hierarchical Clustering
  • Exercise: Creating a Decision Tree

Data Science Statistics: Simple Descriptive Statistics

Course: 1 Hour, 11 Minutes

  • Course Overview
  • Descriptive and Inferential Statistics
  • Population vs. Sample
  • Probability vs. Non-Probability Sampling
  • Mean
  • Median
  • Mode
  • IQR
  • Variance
  • Exercise: Using Descriptive Statistics
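
The measures listed in this course (mean, median, mode, IQR, variance) can be tried out with Python's standard `statistics` module. The following is a minimal sketch, not course material; the sample values are invented purely for illustration.

```python
import statistics

# Hypothetical sample values, chosen only for illustration.
data = [2, 4, 4, 5, 7, 9, 11]

mean = statistics.mean(data)          # arithmetic average
median = statistics.median(data)      # middle value of the sorted data
mode = statistics.mode(data)          # most frequent value
variance = statistics.variance(data)  # sample variance (n - 1 denominator)

# quantiles(n=4) returns the three quartile cut points Q1, Q2, Q3.
q1, _, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1                         # interquartile range

print(mean, median, mode, variance, iqr)
```

Note that `statistics.variance` computes the sample variance; `statistics.pvariance` would be the choice when the data is the whole population rather than a sample.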

Data Science Statistics: Common Approaches to Sampling Data

Course: 47 Minutes

  • Course Overview
  • Terms in Sampling
  • Sampling Bias
  • Simple Random Sampling
  • Systematic Random Sampling
  • Stratified Sampling
  • Non-Probability Sampling
  • Exercise: Efficient and Correct Sampling
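
The three probability sampling schemes named above can be compared side by side in a few lines of Python. This is an illustrative sketch only: the population, the stratum labels, and the function names are assumptions made for the example and do not come from the course.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 100 numbered units, each tagged with a stratum.
population = [{"id": i, "stratum": "A" if i < 60 else "B"} for i in range(100)]

def simple_random_sample(units, k):
    """Every unit has the same chance of being selected."""
    return rng.sample(units, k)

def systematic_sample(units, k):
    """Pick a random start, then take every step-th unit."""
    step = len(units) // k
    start = rng.randrange(step)
    return units[start::step][:k]

def stratified_sample(units, k):
    """Sample each stratum in proportion to its share of the population."""
    strata = {}
    for unit in units:
        strata.setdefault(unit["stratum"], []).append(unit)
    sample = []
    for members in strata.values():
        share = round(k * len(members) / len(units))
        sample.extend(rng.sample(members, share))
    return sample

srs = simple_random_sample(population, 10)
systematic = systematic_sample(population, 10)
stratified = stratified_sample(population, 10)
```

With a 60/40 split between the strata, the stratified sample of 10 always contains 6 units from stratum A and 4 from stratum B, whereas the simple random sample only matches that ratio on average.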

Data Science Statistics: Inferential Statistics

Course: 1 Hour, 2 Minutes

  • Course Overview
  • Gaussian Distribution
  • Inferential Statistics and Hypothesis Testing
  • Simplified Example of Hypothesis Testing
  • T-tests
  • Skewness and Kurtosis
  • Correlation and Autocorrelation
  • Introducing Linear Regression
  • Overfitting and Goodness-of-Fit
  • Exercise: Basic Inferential Statistics

Accessing Data with Spark: An Introduction to Spark

Course: 1 Hour, 7 Minutes

  • Course Overview
  • Introduction to Spark and Hadoop
  • Resilient Distributed Datasets (RDDs)
  • RDD Operations
  • Spark Data Frames
  • Spark Architecture
  • Spark Installation
  • Working with RDDs
  • Creating Data Frames from RDDs
  • Contents of a Data Frame
  • The SQL Context
  • The map() Function of an RDD
  • Accessing the Contents of a Data Frame
  • Data Frames in Spark and Pandas
  • Exercise: Working with Spark

Getting Started with Hadoop: Fundamentals & MapReduce

Course: 1 Hour, 4 Minutes

  • Course Overview
  • An Introduction to Big Data
  • Building Systems to Scale with Data
  • A Quick Overview of Hadoop
  • MapReduce Overview
  • The Map Phase of a MapReduce
  • The Shuffle and Reduce Phases
  • Exercise: Fundamentals of Hadoop and MapReduce
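
The map, shuffle, and reduce phases listed above can be mimicked on a single machine to see how data flows between them. The sketch below is a plain-Python word count and does not use Hadoop or reproduce the course's own code; all function names are hypothetical.

```python
from collections import defaultdict

def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle_phase(pairs):
    """Group all values by key, as happens between the map and reduce phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)

def word_count(lines):
    """Run the three phases in sequence over an iterable of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())

counts = word_count(["big data", "Big clusters"])
print(counts)
```

In a real cluster the map and reduce calls run in parallel on different nodes, and the shuffle moves data across the network; here everything runs sequentially in one process.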

Getting Started with Hadoop: Developing a Basic MapReduce Application

Course: 1 Hour, 14 Minutes

  • Course Overview
  • Provisioning a Hadoop Cluster on the Cloud
  • Browsing the Hadoop Web Applications
  • Creating a MapReduce Project
  • Coding the Map Phase
  • Coding the Reduce Phase
  • Defining the Driver Program
  • Building the Application
  • Executing the MapReduce Application
  • Exercise: Developing a Basic MapReduce Application

Hadoop HDFS: Introduction

Course: 1 Hour, 15 Minutes

  • Course Overview
  • Scaling Datasets
  • Horizontal Scaling for Big Data
  • Distributed Clusters and Horizontal Scaling
  • Overview of HDFS
  • HDFS Architectures
  • MapReduce for HDFS
  • YARN for HDFS
  • The Mechanism of Resource Allocation in Hadoop
  • Apache Zookeeper for HDFS
  • The Hadoop Ecosystem
  • Exercise: An Introduction to HDFS

Hadoop HDFS: Introduction to the Shell

Course: 53 Minutes

  • Course Overview
  • Creating a Hadoop Cluster on the Google Cloud
  • Exploring Hadoop Clusters
  • The YARN Cluster Manager UI
  • The HDFS Name Node UI
  • Browsing the Packaged Hadoop Tools
  • Configuring HDFS
  • The HDFS Shells
  • Exercise: Introduction to the HDFS Shell

Hadoop HDFS: Working with Files

Course: 48 Minutes

  • Course Overview
  • Basic Directory Commands in HDFS
  • Using the copyFromLocal Command in HDFS
  • Using the put Command in HDFS
  • Using the copyToLocal Command in HDFS
  • Retrieving Files from HDFS
  • Append and Delete Operations in HDFS
  • Exercise: Working with Files on HDFS

Hadoop HDFS: File Permissions

Course: 49 Minutes

  • Course Overview
  • The HDFS count and du Commands
  • Viewing and Setting File Permissions in HDFS
  • Applying Permissions Recursively in HDFS
  • An Introduction to Bash Scripting
  • Scripting HDFS Operations
  • Exploring the HDFS Name Node UI
  • Cleanup Operations in HDFS
  • Exercise: File Permissions on HDFS

Data Silos, Lakes, & Streams: Introduction

Course: 1 Hour, 20 Minutes

  • Course Overview
  • Data Silos
  • Data Lakes
  • Characteristics of Data Lakes
  • Data Lake Architecture, Features, and Challenges
  • Data Warehouses
  • Data Warehouses vs. Data Lakes
  • Data Streams
  • Migrating Data to AWS
  • Data Lakes on AWS
  • Working with Data Lakes on AWS
  • Exercise: Data Silos, Lakes, and Streams

Data Silos, Lakes, and Streams: Data Lakes on AWS

Course: 1 Hour, 10 Minutes

  • Course Overview
  • Create a Role for the AWS Glue Service
  • Upload Data to S
  • Explore the Glue Web Console
  • Manually Create Glue Tables
  • Query the Data Lake Using Amazon Athena
  • Configure and Run Glue Crawlers
  • Access Data in Crawled Tables
  • Crawl Multiple CSV Files in the Same Folder Path
  • Merge Data in Multiple Files in the Same Folder Path
  • Work with Files Having the Exact Same Schema
  • Exercise: Data Lakes on AWS with S3 and Glue

Data Silos, Lakes, & Streams: Sources, Visualizations, & ETL Operations

Course: 1 Hour, 29 Minutes

  • Course Overview
  • Set Up a Redshift Cluster
  • Create Tables and Load Data From S
  • Establish a JDBC Connection to Redshift
  • Crawl Redshift Using a JDBC Connection
  • Crawl DynamoDB
  • Configure QuickSight to Visualize Data
  • Visualize Data in QuickSight
  • Configure a Job to Perform Extract, Transform, Load
  • Execute an ETL Operation in Glue
  • Perform ETL to Back Up Redshift Data in S3 Buckets
  • Perform ETL to Back Up DynamoDB Data in S3 Buckets
  • Exercise: Multiple Sources, Visualizations, and ETL

Data Analysis Application

Course: 1 Hour, 25 Minutes

  • Course Overview
  • Install and Configure Anaconda Python
  • Install R Using Anaconda
  • Use Jupyter Notebook
  • Import and Export Data in Python
  • Import and Export Data in R
  • Deal with Missing Data in R
  • Transform Data in R
  • Work with NumPy
  • Work with Pandas
  • Mean, Median, and Mode in R
  • Analyze Data with Pandas
  • Plot Data in R
  • Visualize Data in Python
  • Exercise: Perform Data Analysis

Data Science Track 2: Data Wrangler

Data Wrangling with Pandas: Working with Series & DataFrames

Course: 1 Hour, 11 Minutes

  • Course Overview
  • Installing Pandas
  • Pandas Series Objects
  • Operations on Series
  • Appending and Sorting Series Values
  • Pandas DataFrames
  • Indexing Operations with DataFrames
  • Missing Data
  • Column Aggregations
  • Statistical Operations
  • Exercise: Operations on Series and DataFrames

Data Wrangling with Pandas: Visualizations and Time-Series Data

Course: 1 Hour, 29 Minutes

  • Course Overview
  • Pandas and Matplotlib for Visualizations
  • Pie Charts, Box Plots, and Scatter Plots
  • Time-Series Data
  • Deltas and Percentage Change Calculations
  • Time Deltas and Date Ranges
  • Mismatched DataFrames and Missing Data
  • Working with String Data
  • Advanced Operations on Strings
  • Applying Functions on Series
  • Transforming Data With User-Defined Functions
  • Applying Functions on DataFrames
  • Exercise: Plot Charts and Transform Column Values

Data Wrangling with Pandas: Advanced Features

Course: 1 Hour, 12 Minutes

  • Course Overview
  • Grouping and Aggregations
  • MultiIndex DataFrames
  • Grouping and Aggregations with MultiIndex DataFrames
  • General Aggregation Functions
  • Filtering
  • Masking Column Values
  • Working with Duplicates
  • Working with Categorical Data
  • Filtering, Adding, and Removing Categories
  • Reindexing
  • Exercise: Filtering, Duplicates, and Categorical Data

Data Wrangler 4: Cleaning Data in R

Course: 1 Hour, 3 Minutes

  • Course Overview
  • Types of Unclean Data
  • Data Quality
  • Downloading JSON Data
  • Excel Sheets
  • Reading Dirty CSVs
  • Querying Relational Databases
  • Joining Tabular Data
  • Spreading Data
  • Summarizing Data
  • Imputing Data
  • Extracting Matches
  • Exercise: Wrangling Data

Data Tools: Technology Landscape & Tools for Data Management

Course: 27 Minutes

  • Course Overview
  • Technology Landscape and Tools
  • Tool Comparison
  • Machine Learning in Data Analytics
  • Machine Learning Tools
  • Machine Learning Implementation
  • Python and R for Data Management
  • Cloud and Machine Learning
  • Exercise: Implement Machine Learning on Scikit-learn

Data Tools: Machine Learning & Deep Learning in the Cloud

Course: 23 Minutes

  • Course Overview
  • Microsoft Machine Learning Toolkit
  • AWS and Machine Learning
  • Spark Machine Learning Capabilities
  • Deep Learning Frameworks
  • Deep Learning Implementation
  • Data Mining and Analytical Tools
  • KNIME Capabilities
  • Exercise: Implement Deep Learning

Trifacta for Data Wrangling: Wrangling Data

Course: 50 Minutes

  • Course Overview
  • Standardizing Data
  • Formatting Dates
  • Filtering Rows
  • Replacing Values
  • Counting Matches
  • Splitting Columns
  • Merging Columns
  • Extracting Data
  • Conditional Aggregation
  • Reshaping Data
  • Joining Data
  • Exercise: Wrangling Data

MongoDB for Data Wrangling: Querying

Course: 1 Hour, 8 Minutes

  • Course Overview
  • Introduction to PyMongo
  • Document Structure
  • CRUD Operations
  • ObjectID and Timestamp
  • Query Operations
  • Projection Queries
  • Comparison Operators
  • Element Query Operators
  • The Regex Operator
  • Using the Size and All Operators
  • Text Search
  • Using mongoimport
  • Using mongoexport
  • Exercise: Performing a Query

MongoDB for Data Wrangling: Aggregation

Course: 51 Minutes

  • Course Overview
  • Aggregation Framework
  • Using Group
  • Using Match
  • Using Project
  • Using Limit and Sort
  • Using Unwind
  • Using Lookup
  • Using Indexes
  • Using Geospatial Indexes
  • Exercise: Performing an Aggregate Query

Getting Started with Hive: Introduction

Course: 56 Minutes

  • Course Overview
  • Hive as a Data Warehouse
  • Overview of Relational Databases
  • OLTP and OLAP
  • Hive and the Hadoop Ecosystem
  • Hive Server and The Metastore
  • Hive on Cloud Computing Platforms
  • Data Types in Hive
  • Data and Tables in Hive
  • Exercise: Introduction to Hive

Getting Started with Hive: Loading and Querying Data

Course: 1 Hour, 20 Minutes

  • Course Overview
  • Setting up a Hadoop Cluster on the Google Cloud
  • Creating a Hive Table
  • Running Simple Queries in Hive
  • Executing Hive Queries from the Shell
  • Joining Tables in Hive
  • Exploring the Hive Warehouse
  • External Tables in Hive
  • Modifying Tables in Hive
  • Temporary Tables in Hive
  • Loading Data into Tables in Hive
  • Populating Multiple Tables in Hive
  • Exercise: Loading and Querying Data in Hive

Getting Started with Hive: Viewing and Querying Complex Data

Course: 1 Hour, 14 Minutes

  • Course Overview
  • The Array Data Type in Hive
  • The Map Data Type in Hive
  • The Struct Type in Hive
  • The explode and posexplode Functions in Hive
  • Lateral Views in Hive
  • Multiple Lateral Views in Hive
  • Set Operations in Hive
  • The IN and EXISTS clauses in Hive
  • Creating and Populating Tables in Hive
  • Views in Hive
  • Exercise: Viewing and Querying Complex Data

Getting Started with Hive: Optimizing Query Executions

Course: 43 Minutes

  • Course Overview
  • Hive Queries as MapReduce Jobs
  • Techniques to Improve Query Performance in Hive
  • Partitioning Tables in Hive
  • Bucketing Tables in Hive
  • Structuring Join Queries in Hive
  • Exercise: Optimizing Query Execution in Hive

Getting Started with Hive: Optimizing Query Executions with Partitioning

Course: 1 Hour, 1 Minute

  • Course Overview
  • Setting up a Hadoop Cluster on the Google Cloud
  • Creating a Partitioned Table in Hive
  • Working with Partitions in Hive
  • Populating Partitions in Hive
  • Partitioning External Tables in Hive
  • Modifying Partitions in Hive
  • Dynamic Partitions in Hive
  • Using Multiple Columns for Partitioning in Hive
  • Exercise: Optimize Executions with Partitioning

Getting Started with Hive: Bucketing & Window Functions

Course: 1 Hour, 4 Minutes

  • Course Overview
  • Apply Bucketing for a Table in Hive
  • Using Bucketing and Partitioning Together in Hive
  • Sorting a Bucket's Contents in Hive
  • Sampling a Table in Hive
  • Joining Multiple Tables in Hive
  • Introducing Window Functions in Hive
  • Window Functions with Partitions in Hive
  • Exercise: Bucketing and Window Functions in Hive

Getting Started with Hadoop: Filtering Data Using MapReduce

Course: 59 Minutes

  • Course Overview
  • Counting the Data Points in Each Category
  • The Reducer and Driver Programs
  • Building and Executing the Application
  • A Simple Filter Using MapReduce
  • Executing and Examining the Output
  • Extracting the Unique Values in a Column
  • Viewing the Distinct Values Extracted
  • Exercise: Filtering Data Using MapReduce

Getting Started with Hadoop: MapReduce Applications With Combiners

Course: 1 Hour, 24 Minutes

  • Course Overview
  • Combiners in MapReduce
  • Revisiting MapReduce
  • Working with Combiners
  • Using Combiners for Calculating Averages
  • Creating a Project to Calculate Averages
  • Coding the Map and Reduce Phases
  • Configure the Application in the Driver
  • Executing the Application and Examining the Output
  • Adding a Combiner to a MapReduce Application
  • Conveying a Pair of Numbers from the Mapper
  • Running the Fixed Application
  • Exercise: Optimizing MapReduce With Combiners
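
A likely reason the outline includes "Conveying a Pair of Numbers from the Mapper" is that an average cannot be combined safely from partial averages, while a (sum, count) pair can. The following plain-Python sketch illustrates that idea under this assumption; it does not use Hadoop, and the function names and sample values are hypothetical.

```python
def mapper(record):
    """Emit (key, (value, 1)): a running sum and a count, not a bare value."""
    key, value = record
    return key, (value, 1)

def combine(pairs):
    """Merge (sum, count) pairs; safe to apply any number of times."""
    total = sum(s for s, _ in pairs)
    count = sum(c for _, c in pairs)
    return total, count

def reducer(pairs):
    """Produce the final average from the merged (sum, count) pairs."""
    total, count = combine(pairs)
    return total / count

# Hypothetical values for one key, split unevenly across two mapper nodes.
node_a = [mapper(("x", v))[1] for v in (1, 2, 3)]
node_b = [mapper(("x", v))[1] for v in (10,)]

# Correct: each node combines locally, the reducer merges the pairs.
with_combiner = reducer([combine(node_a), combine(node_b)])

# Incorrect: averaging the per-node averages weights both nodes equally.
naive_average_of_averages = ((1 + 2 + 3) / 3 + 10 / 1) / 2

print(with_combiner, naive_average_of_averages)
```

The true average of 1, 2, 3, and 10 is 4.0, which the (sum, count) approach reproduces; averaging the two node averages gives 6.0 because the single value on the second node is over-weighted.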

Getting Started with Hadoop: Advanced Operations Using MapReduce

Course: 49 Minutes

  • Course Overview
  • Defining a User-Defined Type for a PriorityQueue
  • Implementing a PriorityQueue in a Mapper
  • Using a PriorityQueue in a Reducer
  • Running and Verifying the Results
  • Building an Inverted Index - Map Phase
  • Building an Inverted Index - Reduce Phase
  • Executing the Application and Viewing the Index
  • Exercise: Advanced Operations Using MapReduce

Accessing Data with Spark: Data Analysis Using the Spark DataFrame API

Course: 1 Hour, 12 Minutes

  • Course Overview
  • Performance Improvements in Spark
  • Broadcast Variables and Accumulators
  • Loading Data into a DataFrame
  • Sampling the Contents of a DataFrame
  • Grouping and Aggregations
  • Visualizing Data in a DataFrame
  • Trimming and Cleaning Data
  • User-Defined Functions and DataFrames
  • Combining Filters, Aggregations, and Sorting
  • Using Broadcast Variables
  • Using Accumulators
  • Exporting DataFrame Contents
  • Custom Accumulators
  • Join Operations
  • Exercise: Data Analysis Using the DataFrame API

Accessing Data with Spark: Data Analysis using Spark SQL

Course: 55 Minutes

  • Course Overview
  • The Spark Catalyst Optimizer
  • Introduction to Spark SQL
  • Preparing Data for Analysis
  • Running SQL Queries
  • Inferred and Explicit Schemas
  • Windowing in Spark
  • Applying Window Functions
  • Exercise: Data Analysis Using Spark SQL

Data Lake: Framework & Design Implementation

Course: 34 Minutes

  • Course Overview
  • Data Lakes and Data Warehouses
  • Data Lake Selection Criteria
  • Data Lake and Data Democratization
  • Data Lake Design Principles
  • AWS Data Lake Architecture
  • Implement AWS Data Store
  • Data Lake For On-Premise and Multi-Cloud
  • Data Processing Frameworks for Data Lake
  • Exercise: Implement AWS Data Store

Data Lake: Architectures & Data Management Principles

Course: 35 Minutes

  • Course Overview
  • Real-Time Big Data Architectures
  • Data Lake Reference Architecture
  • Data Ingestion and File Formats
  • Ingestion Using Sqoop
  • Data Processing Strategies
  • Deriving Value from Data Lakes
  • Data Life Cycle
  • S3 and Glacier
  • Exercise: Ingest Data and Implement Archival Policy

Data Architecture - Deep Dive: Design & Implementation

Course: 36 Minutes

  • Course Overview
  • Data Complexity Management Strategies
  • Data Modeling Process
  • Distributed Data Management
  • Partitioning Methods and Criteria
  • MongoDB Partitioning
  • Hybrid Data Architectures
  • Implement Directed Acyclic Graph
  • CAP Theorem
  • Batch vs. Streaming
  • Read and Write Concerns
  • Exercise: Implement Serverless Architecture

Data Architecture - Deep Dive: Microservices & Serverless Computing

Course: 26 Minutes

  • Course Overview
  • Microservices and Data
  • Serverless and Lambda Architecture
  • Lambda Implementation
  • Cluster Benefits
  • Data Architecture Types
  • Data Discovery Process
  • Data Risk Types
  • Data POC
  • Exercise: Implement Lambda Architecture

Data Science Track 3: Data Ops

Deploying Data Tools: Data Science Tools

Course: 48 Minutes

  • Course Overview
  • Data Science Platform
  • Challenges of Deploying Data Science Tools
  • Considerations for Data Science Tools
  • Data Science Workflow
  • Data Science Analytic Tools
  • Data Science Visualization Tools
  • Data Science Database Tools
  • Benefits of Deploying Cloud-Based Tools
  • Challenges of Deploying Cloud-Based Tools
  • What is DevOps
  • DevOps for Data Science
  • Exercise: Identifying Uses of Data Science Tools

Delivering Dashboards: Management Patterns

Course: 34 Minutes

  • Course Overview
  • Analytical Visualization
  • Dashboard Types
  • Data Management
  • Dashboard Components
  • Dashboard Best Practices
  • Dashboard Using ELK
  • Dashboard Using Power BI
  • Chart Selection Criteria
  • Leaderboards and Scorecards
  • Scorecard Types
  • Exercise: Create Dashboards with Power BI and ELK

Delivering Dashboards: Exploration & Analytics

Course: 31 Minutes

  • Course Overview
  • Data Exploration Using Charts
  • Analytical Visualization Tools
  • Bar and Line Charts
  • Dashboarding with Kibana
  • Dashboard Sharing with Kibana
  • Dashboarding with Tableau
  • Dashboarding with QlikView
  • Data Ingest and Dashboards
  • Dashboard Patterns
  • Monitoring Dashboards
  • Exercise: Create Dashboards Using Kibana and Tableau

Cloud Data Architecture: DevOps & Containerization

Course: 45 Minutes

  • Course Overview
  • Containerization on the Cloud
  • Benefits of Containers
  • Serverless Computing
  • DevOps in the Cloud
  • AWS OpsWorks
  • Storage Classification
  • Cloud and Machine Learning
  • Cloud and BI Analytics
  • Exercise: Containerization and Serverless Computing

Compliance Issues and Strategies: Data Compliance

Course: 44 Minutes

  • Course Overview
  • Data Compliance Issues
  • Data Regulations
  • The Importance of Global Standards
  • Risk and Company Standards
  • Myths and Facts of Data Compliance
  • Compliance Training for Users
  • Compliance Training for Management
  • The Benefits of a Data Compliance Program
  • Elements of a Good Compliance Strategy
  • Building a Compliance Strategy
  • Reporting and Response Procedures
  • Exercise: Explain the Importance of Data Compliance

Implementing Governance Strategies

Course: 46 Minutes

  • Course Overview
  • Governance and its Relationship with Big Data
  • Why Big Data Requires Governance
  • Requirements for Big Data Governance
  • Why is Big Data Different?
  • Identifying Data
  • Identifying Stakeholders
  • Cloud Technologies and Data Governance
  • Designing a Data Governance Process
  • Managing a Data Governance Strategy
  • Monitoring a Data Governance Strategy
  • Maintaining a Data Governance Strategy
  • Exercise: Defining Data Governance Strategies

Data Access & Governance Policies: Data Access Oversight and IAM

Course: 59 Minutes

  • Course Overview
  • Data Access Governance
  • Risk and Data Safety Compliance
  • Data Access Patterns
  • Data Breach Prevention
  • Least Privilege
  • Assign and View Effective File System Permissions
  • Identity and Access Management
  • Create an AWS IAM User and Group
  • Assign AWS IAM Group Permissions
  • Vulnerability Assessments
  • Implement Effective Security Controls
  • Exercise: Implement Data Access Governance Solutions

Data Access & Governance Policies: Data Classification, Encryption, and Monitoring

Course: 1 Hour, 19 Minutes

  • Course Overview
  • Data Classification
  • Classify Data Using Microsoft FSRM
  • Data Encryption
  • Encrypt Data at Rest
  • Encrypt Data in Motion
  • Implement Security Compliance Checking
  • Examine Data Access Trends
  • Data Access Monitoring Solutions
  • Logging, Auditing, and Data Analytics
  • Configure a Custom Filtered Log View
  • Enable Windows Data Access Auditing
  • Exercise: Implement Data Confidentiality

Streaming Data Architectures: An Introduction to Streaming Data

Course: 51 Minutes

  • Course Overview
  • Introduction to Streaming Data
  • The Stream Processing Model
  • The Message Transport
  • Stream Processing with RDDs
  • Structured Streaming for Continuous Applications
  • Streaming vs Structured Streaming
  • Triggers and Output Modes
  • Exercise: Working with Streaming Data

Streaming Data Architectures: Processing Streaming Data

Course: 53 Minutes

  • Course Overview
  • PySpark Setup
  • Setting Up a Socket Stream with Netcat
  • The Update Output Mode
  • Using a File Input Stream
  • The Append Output Mode
  • The Complete Output Mode
  • Aggregations on Streaming Data
  • SQL Operations on Streaming Data
  • User-Defined Functions (UDFs)
  • Exercise: Processing Streaming Data

Scalable Data Architectures: Introduction

Course: 53 Minutes

  • Course Overview
  • Scalable Architectures with Distributed Computing
  • Introducing Data Warehouses
  • Contrasting Warehouses with Relational Databases
  • Data Warehouses for Analytical Processing
  • Data Warehouse Architectural Components
  • Amazon Redshift - A Data Warehouse on the Cloud
  • Exercise: Scalable Data Architectures

Scalable Data Architectures: Introduction to Amazon Redshift

Course: 55 Minutes

  • Course Overview
  • Provisioning a Redshift Cluster Using Quick Launch
  • Creating a Redshift Cluster With Additional Detail
  • Exploring the Redshift Configs and Metrics
  • Attaching an IAM Role to a Redshift Cluster
  • Creating an AWS User to Work With Redshift
  • Installing and Configuring the AWS CLI
  • Running Queries from the Redshift Query Editor
  • Exercise: An Introduction to Amazon Redshift

Scalable Data Architectures: Working with Amazon Redshift & QuickSight

Course: 1 Hour, 18 Minutes

  • Course Overview
  • Loading Data from Amazon S3 to a Redshift Cluster
  • Running Queries and Evaluating Their Execution
  • Querying a Redshift Cluster Using a SQL Client
  • Working with Automated Snapshots
  • Restoring Tables from a Snapshot
  • Horizontal Scaling of a Redshift Cluster
  • Vertical and Horizontal Scaling of a Cluster
  • Configuring Access from QuickSight to Redshift
  • Loading a Dataset to QuickSight
  • Creating Visualizations with QuickSight
  • Exercise: Working with Redshift and QuickSight

Building Data Pipelines

Course: 1 Hour, 10 Minutes

  • Course Overview
  • Data Pipelines Overview
  • Traditional ETL Pipeline with Batch Processing
  • Data Pipeline Tools
  • Setup and Install Airflow
  • Apache Airflow
  • Airflow Workflows
  • Airflow Tasks
  • Airflow Dependencies
  • ETL Pipeline with Airflow
  • Automated Pipeline without ETL
  • Airflow Command Line Testing
  • Exercise: Using Apache Airflow

Data Pipeline: Process Implementation Using Tableau & AWS

Course: 39 Minutes

  • Course Overview
  • Data Pipeline
  • Data Pipeline Processes
  • Data Pipeline Stages
  • Data Pipeline Technologies
  • Data Source Types
  • Scheduled Data Pipeline
  • Tableau Server and Utilities
  • Data Pipeline Using Tableau
  • Data Pipeline on AWS
  • Exercise: Build Data Pipelines with Tableau

Data Pipeline: Using Frameworks for Advanced Data Management

Course: 33 Minutes

  • Course Overview
  • Celery and Luigi
  • Data Pipeline with Python Luigi
  • Working with Dask Library
  • Dask Arrays
  • Data Exploration and Visualization Frameworks
  • Spark and Tableau
  • Streaming Data Visualization with Python
  • Data Pipeline Open Source Tools
  • Exercise: Implement Data Pipelines with Luigi

Data Sources: Integration

Course: 40 Minutes

  • Course Overview
  • Elements of IoT Solutions
  • Service Categories in IoT
  • IoT Capabilities and Maturity Model
  • IoT Design Principles
  • IoT Cloud Architectures
  • MQTT and XMPP
  • IoT Controllers
  • IoT Data Management
  • Securing IoT
  • Exercise: Generating Data Streams

Data Sources: Implementing Edge on the Cloud

Course: 31 Minutes

  • Course Overview
  • AWS IoT Greengrass
  • GCP IoT Edge
  • AWS IoT over WebSockets
  • IoT Device Simulator
  • Generating Streams of Data Using MQTT
  • Exercise: Working with IoT Device Simulators

Securing Big Data Streams

Course: 1 Hour, 3 Minutes

  • Course Overview
  • Big Data Security Concerns
  • Streaming Data Security Concerns
  • NoSQL Database Security Concerns
  • Distributed Processing Security Risks
  • Data Mining and Analytics Privacy Flaws
  • End-Point Device Tampering Risks
  • Secure Big Data
  • Secure Data Streams
  • Secure Data In Motion
  • End-Point Input Validation and Filtering
  • Secure Data at Rest with Symmetric Ciphers
  • Exercise: Securing Big Data Streams

Harnessing Data Volume & Velocity: Big Data to Smart Data

Course: 39 Minutes

  • Course Overview
  • Comparing Big Data and Smart Data
  • Smart Data and Edge Technologies
  • Big Data to Smart Data Formation
  • Smart Data and Smart Processes
  • Smart Data Use Cases
  • Smart Data Life Cycle
  • Big Data to Smart Data Using k-NN
  • Smart Data Frameworks
  • Smart Data to Business
  • Clustering Smart Data
  • Smart Data Integration
  • Exercise: Transform Big Data to Smart Data

Data Rollbacks: Transaction Rollbacks & Their Impact

Course: 36 Minutes

  • Course Overview
  • Rollback Process
  • State of Transactions
  • Transaction Types
  • SQL Transaction Management
  • Transaction Log Operations
  • Deadlock Management
  • SQL Server Rollback Mechanism
  • SQL Server Rollback Mechanism Implementation
  • Exercise: Implement Transactions with SQL Server

Data Rollbacks: Transaction Management & Rollbacks in NoSQL

Course: 29 Minutes

  • Course Overview
  • NoSQL and SQL Transaction Management
  • MongoDB Transactions
  • Manage Multi-Document Transactions in MongoDB
  • Change Data Capture
  • Change Stream in MongoDB
  • MongoDB Change Stream Implementation
  • Exercise: MongoDB Transactions and Change Streams

Data Science Track 4: Data Scientist

Balancing the Four Vs of Data: The Four Vs of Data

Course: 40 Minutes

  • Course Overview
  • Overview of the Four Vs
  • The Importance of Volume
  • The Importance of Variety
  • The Importance of Velocity
  • The Importance of Veracity
  • The Relationship Between the Four Vs
  • Variety and Data Structure
  • Validity and Volatility
  • Finding Balance in the Four Vs
  • Use Cases
  • Extracting Value from the Four Vs
  • Exercise: Describe the Four Vs of Big Data

Data Driven Organizations

Course: 1 Hour, 15 Minutes

  • Course Overview
  • Data Driven Organizations
  • Decision Making
  • Analytic Maturity
  • Analytic Roles
  • Data Source Priority
  • Facets of Data Quality
  • Power BI Data Visualization
  • Missing Data
  • Duplicate Data
  • Truncated Data
  • Data Provenance
  • Exercise: Use Informatica Data Quality

Raw Data to Insights: Data Ingestion & Statistical Analysis

Course: 54 Minutes

  • Course Overview
  • Statistical Analysis
  • Data Correction
  • Outlier Detection
  • Data Architecture Pattern
  • Data Ingestion Tools
  • Kafka and Apache NiFi
  • Apache Sqoop Ingest
  • Ingest Using WaveFront
  • Exercise: Detecting Outliers and Ingesting Data

Raw Data to Insights: Data Management & Decision Making

Course: 57 Minutes

  • Course Overview
  • Data-driven Decision Making Framework
  • Loading Data into R
  • Preparing Data
  • Data Correction Approach
  • Data Correction Using Simple Transformation
  • Data Correction Using Deductive Correction
  • Distributed Data Management
  • Data Analytics
  • Data Analytics Using R
  • Predictive Modeling
  • Exercise: Correcting and Modelling Data

Tableau Desktop: Real Time Dashboards

Course: 1 Hour, 8 Minutes

  • Course Overview
  • Introducing Real Time Dashboards
  • Creating Real Time Dashboards with Tableau
  • Build a Tableau Dashboard
  • Real Time Dashboard Updates in Tableau
  • Organizing Your Tableau Dashboard
  • Formatting Your Tableau Dashboard
  • Interactive Tableau Dashboard
  • Tableau Dashboard Starters
  • Tableau Dashboard Extensions
  • Tableau Dashboards and Story Points
  • Sharing your Tableau Dashboard
  • Exercise: Creating a Tableau Dashboard Starter

Storytelling with Data: Introduction

Course: 47 Minutes

  • Course Overview
  • Storytelling Process
  • Interpreting Context
  • Analysis Types
  • Who, What, and How of Storytelling
  • Visualization for Storytelling
  • Graphical Tools for Data Elaboration
  • Storytelling Scenarios
  • Storyboarding
  • Exercise: Visualization and Graphical Tool

Storytelling with Data: Tableau & PowerBI

Course: 57 Minutes

  • Course Overview
  • Visual Selection
  • Slopegraphs
  • Bar Charts and Types of Bar Charts
  • Clutter and Clutter Elimination
  • Gestalt Principle
  • Story Design Best Practices
  • Tools for Storytelling
  • Decluttering
  • Crafting Visual Data
  • Visual Design Concerns
  • Storytelling with Power BI
  • Model Visual and Tableau
  • Exercise: Storytelling with Power BI

Python for Data Science: Basic Data Visualization Using Seaborn

Course: 1 Hour, 7 Minutes

  • Course Overview
  • Introduction to Seaborn
  • Install Seaborn
  • Simple Univariate Distributions
  • Configure Univariate Distribution Plots
  • Simple Bivariate Distributions
  • Explore Different Types of Bivariate Distributions
  • Analyze Multiple Variable Pairs
  • Regression Plots
  • Themes and Styles in Seaborn
  • Exercise: Basic Data Visualization Using Seaborn

Python for Data Science: Advanced Data Visualization Using Seaborn

Course: 1 Hour, 4 Minutes

  • Course Overview
  • Searching for Patterns in a Dataset
  • Configuring Plot Aesthetics
  • Normal Distribution and Outliers
  • Distributions Within Categories - Part 1
  • Distributions Within Categories - Part 2
  • Analyzing Categories with Facet Grids - Part 1
  • Analyzing Categories with Facet Grids - Part 2
  • Introducing Color Palettes
  • Using Color Palettes
  • Exercise: Advanced Data Visualization Using Seaborn

Data Science Statistics: Using Python to Compute & Visualize Statistics

Course: 1 Hour, 16 Minutes

  • Course Overview
  • An Introduction to Matplotlib
  • Analyzing Data Using NumPy and Pandas
  • Visualizing Univariate and Bivariate Distributions
  • Summary Statistics Using Native Python Functions
  • Summary Statistics Using NumPy
  • Summary Statistics Using the SciPy Library
  • Correlation and Covariance
  • Z-score
  • Exercise: Compute and Visualize Statistics

R for Data Science: Data Visualization

Course: 33 Minutes

  • Course Overview
  • Using Scatter Plots
  • Using Line Graphs
  • Using Bar Charts
  • Using Box and Whisker Plots
  • Using Histograms
  • Using a Bubble Plot
  • Exercise: Data Visualization

Advanced Visualizations & Dashboards: Visualization Using Python

Course: 38 Minutes

  • Course Overview
  • Relevance of Data Visualization for Business
  • Libraries for Data Visualization in Python
  • Python Data Visualization Environment Configuration
  • Matplotlib Libraries for Visualization
  • Bar Chart Using ggplot
  • Bokeh and Pygal
  • Select Visualization Libraries
  • Interactive Graphs and Image Files
  • Plot Graphs
  • Multiple Lines in Graphs
  • Exercise: Create Line Charts with Pygal

Advanced Visualizations & Dashboards: Visualization Using R

Course: 35 Minutes

  • Course Overview
  • Chart Types
  • Stacked Bar Plot
  • Animate Plots with Matplotlib
  • Plotting in Jupyter Notebook
  • Graphics in R
  • Heat Map and Scatter Plot in R
  • Correlogram and Area Chart in R
  • ggplot2 Capabilities
  • Customize ggplot2 Graphs
  • Exercise: Creating Heat Maps and Scatter Plots

Powering Recommendation Engines: Recommendation Engines

Course: 1 Hour, 5 Minutes

  • Course Overview
  • Describing Recommendation Engines
  • Comparing the Types of Recommendation Engines
  • Collecting and Manipulating Data
  • Manipulating Data in R
  • Describing Similarity and Neighborhoods
  • Creating a Recommendation Engine
  • Recommending Another Item
  • Finding Items to Recommend
  • Recommending Items Based on Other Items
  • Evaluating a Recommendation System
  • Validating a Recommendation System
  • Exercise: Creating a Recommendation Engine

Data Insights, Anomalies, & Verification: Handling Anomalies

Course: 46 Minutes

  • Course Overview
  • Data and Anomaly Sources
  • Decomposition and Forecasting
  • Examine Data Using Randomization Tests
  • Anomaly Detection
  • Anomaly Detection Techniques
  • Anomaly Detection with scikit-learn
  • Anomaly Detection Tools
  • Anomaly Detection Rules
  • Exercise: Detecting Anomalies

Data Insights, Anomalies, & Verification: Machine Learning & Visualization Tools

Course: 51 Minutes

  • Course Overview
  • Machine Learning Anomaly Detection Techniques
  • Comparing Anomaly Detection Algorithms
  • Anomaly Detection Using R
  • Online Anomaly Detection Components
  • Online Anomaly Detection Approaches
  • Anomaly Detection Use Cases
  • Anomaly Detection with Visualization Tools
  • Anomaly Detection with Mathematical Approaches
  • Cluster-Based Anomaly Detection
  • Exercise: Detecting Anomalies

Data Science Statistics: Applied Inferential Statistics

Course: 1 Hour, 19 Minutes

  • Course Overview
  • The One-Sample T-test
  • Independent and Paired T-tests
  • Testing Hypotheses with T-tests
  • Loading and Analyzing a Skewed Dataset
  • Measuring Skewness and Kurtosis
  • Preparing a Dataset for Regression
  • Simple Linear Regression
  • Multiple Linear Regression
  • Exercise: Applied Inferential Statistics

Data Research Techniques

Course: 33 Minutes

  • Course Overview
  • Data Research Fundamentals
  • Data Research Steps
  • Values, Variables, and Observations
  • JMP Scale of Measurement
  • Non-experimental and Experimental Research
  • Descriptive and Inferential Statistical Analysis
  • Inferential Tests
  • Case Study of Clinical Data Research
  • Data Research in Sales Management
  • Exercise: Implement Data Research

Data Research Exploration Techniques

Course: 50 Minutes

  • Course Overview
  • Fundamentals of Exploratory Data Analysis
  • Data Exploration Types
  • Working with R
  • Data Exploration in R
  • Data Exploration Using Plots
  • Python Packages for Data Exploration
  • Data Exploration Using Python
  • Data Research Using Linear Algebra
  • Linear Algebra for Data Research
  • Exercise: R and Python for Data Exploration

Data Research Statistical Approaches

Course: 43 Minutes

  • Course Overview
  • Role of Statistics in Data Research
  • Discrete vs. Continuous Distribution
  • PDF and CDF
  • Binomial Distribution
  • Interval Estimation
  • Point and Interval Estimation
  • Data Visualization Techniques
  • Data Visualization Using R
  • Data Integration Techniques
  • Creating Plots
  • Missing Values and Outliers
  • Exercise: Statistical Methods for Data Research

Machine & Deep Learning Algorithms: Introduction

Course: 46 Minutes

  • Course Overview
  • Machine Learning Algorithms
  • How Machine Learning Works
  • Introduction to Pandas ML
  • Support Vector Machines
  • Overfitting
  • Exercise: Machine Learning and Classification

Machine & Deep Learning Algorithms: Regression & Clustering

Course: 49 Minutes

  • Course Overview
  • The Confusion Matrix
  • An Introduction to Regression
  • Applications of Regression
  • Supervised and Unsupervised Learning
  • Clustering
  • Principal Component Analysis
  • Exercise: Regression and Clustering

Machine & Deep Learning Algorithms: Data Preparation in Pandas ML

Course: 1 Hour, 4 Minutes

  • Course Overview
  • Data Preparation in scikit-learn
  • Training and Evaluating Models in scikit-learn
  • Introducing the Pandas ML ModelFrame
  • Training and Evaluating Models in Pandas ML
  • Preparing Data for Regression
  • Evaluating Regression Models
  • Preparing Data for Clustering
  • The K-Means Clustering Algorithm
  • Exercise: Regression, Classification, and Clustering

Machine & Deep Learning Algorithms: Imbalanced Datasets Using Pandas ML

Course: 1 Hour, 24 Minutes

  • Course Overview
  • Analyzing an Imbalanced Dataset
  • The RandomOverSampler
  • The SMOTE Oversampler
  • Undersampling Using imbalanced-learn
  • Ensemble Classifiers for Imbalanced Data
  • Combination Samplers
  • Finding Correlations in a Dataset
  • Building a Multi-Label Classification Model
  • Dimensionality Reduction with PCA
  • Imbalanced Learn and PCA

Creating Data APIs Using Node.js

Course: 1 Hour, 31 Minutes

  • Course Overview
  • API Prerequisites
  • Building a RESTful API Using Node.js and Express.js
  • RESTful API with OAuth
  • HTTP Server with Hapi.js
  • API Modules
  • Returning Data with JSON
  • Nodemon for Development Workflow
  • API Requests
  • Postman for API
  • Deploying APIs
  • Social Media APIs
  • Exercise: Building RESTful APIs
Course duration: 91:36 hours
Language: English
Online access: 365 days
Certificate of participation: Yes
Award-winning online training: Yes

Reviews

No reviews have been submitted for this product yet.

Student ratings

Springest: 8.9, Edubookers: 8.5

Quality guarantee

Award-winning e-learning & certified tutors

Microsoft Partner and Certiport Partner

Not satisfied? Money back, plus a starter guarantee