Data Analyst to Data Scientist Training





Order this unique Data Analyst to Data Scientist e-learning course online: 1 year of round-the-clock access to extensive interactive videos, progress reporting, and tests.
- Brand:
- Data Science
- Availability:
- In stock
- Award-winning e-learning
- Lowest price guarantee
- Personal service from our team of experts
- Pay securely online or by invoice
- Order and start within 24 hours
Data Analyst to Data Scientist E-Learning Training
Order this unique Data Analyst to Data Scientist e-learning course online: 1 year of round-the-clock access to extensive interactive videos, narration, progress monitoring through reports, and chapter tests to check your knowledge right away.
Data Science Track 1: Data Analyst
Data Science Track 2: Data Wrangler
Data Science Track 3: Data Ops
Data Science Track 4: Data Scientist
Course Content
Data Science Track 1: Data Analyst
Data Architecture Primer
Course: 1 Hour, 4 Minutes
- Course Overview
- Data Defined
- Data Privacy
- The Data Lifecycle
- SQL vs. NoSQL
- Create an Entity Relationship Diagram
- Implement a SQL Solution
- Implement a NoSQL Solution
- Big Data
- Data Architecture and Governance
- IT Data System Architecture Types
- Data Analytics and Reporting
- Exercise: Implement Data Architecture Best Practices
Data Engineering Fundamentals
Course: 46 Minutes
- Course Overview
- Overview of Distributed Systems
- Batch vs. In-Memory Processing
- NoSQL Stores
- Tools for Data Management
- What is ETL?
- ETL with Talend Open Studio
- Data Modeling
- AI and Machine Learning
- Data Partitioning
- Data Engineering
- Data Reporting
- Exercise: Create a Data Model
Python for Data Science: Introduction to NumPy for Multi-dimensional Data
Course: 1 Hour
- Course Overview
- Introduction to NumPy and the NumPy Ecosystem
- Array Creation - Part 1
- Array Creation - Part 2
- Printing Arrays
- Basic Array Operations
- Universal Functions
- Indexing and Slicing
- Iterating Over Arrays
- Reshaping Arrays
- Exercise: Python NumPy Array Operations
Python for Data Science: Advanced Operations with NumPy Arrays
Course: 1 Hour, 8 Minutes
- Course Overview
- Splitting NumPy Arrays
- Images as Arrays
- Image Manipulation Using NumPy
- Views and NumPy Arrays
- Deep Copies of Arrays
- Introduction to Index Masks
- Applying Index Masks
- Indexing with Boolean Masks
- Structured Arrays
- Understanding Array Broadcasting
- Applying Broadcasting Rules on Array Operations
- Exercise: NumPy Multi-dimensional Array Operations
Python for Data Science: Introduction to Pandas
Course: 1 Hour, 6 Minutes
- Course Overview
- Features of Pandas and the Pandas Ecosystem
- Introduction to Pandas
- Work with Pandas
- Introduction to Data Frames
- Work with Data Frames
- Load Data into a Data Frame
- Add and Delete Data Frame Contents
- Select Parts of a Data Frame
- Access Pandas Data Frames
- Introduction to Multi-Indexing in a Data Frame
- Reshape Data Frames
- Reshape Data Frames Using Stack and Melt Operations
- Exercise: Pandas for Basic Tabular Data Manipulation
Python for Data Science: Manipulating and Analyzing Data in Pandas DataFrames
Course: 45 Minutes
- Course Overview
- Iterating Over the Contents of a Data Frame
- Exporting a Data Frame
- Sorting
- Handling Missing Data
- Grouping with a Multi-Index
- Merging Data Frames
- Applying Join Operations on Data Frames
- Pandas and Relational Databases
- Exercise: Pandas for Advanced Data Manipulation
R for Data Science: Data Structures
Course: 52 Minutes
- Course Overview
- Creating Vectors
- Manipulating Vectors
- Sorting Vectors
- Using Lists
- Creating Matrices
- Matrix Operations
- Creating Factors
- Creating Data Frames
- Data Frame Operations
- Exercise: Creating and Using a Data Frame
R for Data Science: Importing and Exporting Data
Course: 34 Minutes
- Course Overview
- Reading from CSV
- Reading from Excel
- Reading from HTML
- Exporting to CSV
- Exporting to Excel
- Exporting to HTML
- Exercise: Reading and Writing Data
R for Data Science: Data Exploration
Course: 41 Minutes
- Course Overview
- Creating dplyr Tables
- Selecting Subsets
- Filtering Tabular Data
- Piping Data
- Mutating Data
- Summarizing Data
- Combining Datasets
- Grouping Data
- Exercise: Querying Data
R for Data Science: Regression Methods
Course: 37 Minutes
- Course Overview
- Linear Data Preparation
- Creating Linear Models
- Interpreting Model Output
- Using Linear Prediction
- Logistic Data Preparation
- Using glm
- Exercise: Creating a Linear Model
R for Data Science: Classification & Clustering
Course: 39 Minutes
- Course Overview
- Preparing Data for Classification
- Using rpart
- Using ctree
- Preparing Data for Clustering
- Using K-Means Clustering
- Using Hierarchical Clustering
- Exercise: Creating a Decision Tree
Data Science Statistics: Simple Descriptive Statistics
Course: 1 Hour, 11 Minutes
- Course Overview
- Descriptive and Inferential Statistics
- Population vs. Sample
- Probability vs. Non-Probability Sampling
- Mean
- Median
- Mode
- IQR
- Variance
- Exercise: Using Descriptive Statistics
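As an illustration of the measures listed for this course (mean, median, mode, IQR, variance), here is a minimal sketch using only Python's standard `statistics` module. The sample values are invented for illustration and are not taken from the course material.

```python
import statistics

# Hypothetical sample, invented for illustration only.
sample = [2, 4, 4, 5, 7, 9, 11]

mean = statistics.mean(sample)      # arithmetic average
median = statistics.median(sample)  # middle value of the sorted data
mode = statistics.mode(sample)      # most frequent value

# quantiles(n=4) returns the three quartile cut points (Q1, Q2, Q3).
q1, _, q3 = statistics.quantiles(sample, n=4)
iqr = q3 - q1                       # interquartile range

sample_variance = statistics.variance(sample)       # divides by n - 1
population_variance = statistics.pvariance(sample)  # divides by n

print(mean, median, mode, iqr, sample_variance, population_variance)
```

The sample and population variances differ only in the divisor, which is why the sketch reports both.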
Data Science Statistics: Common Approaches to Sampling Data
Course: 47 Minutes
- Course Overview
- Terms in Sampling
- Sampling Bias
- Simple Random Sampling
- Systematic Random Sampling
- Stratified Sampling
- Non-Probability Sampling
- Exercise: Efficient and Correct Sampling
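The three probability sampling approaches named in this course can be sketched in a few lines of standard-library Python. The function names, the population of 100 IDs, and the two strata are hypothetical choices made for this illustration only.

```python
import random

def simple_random_sample(population, k, rng):
    """Draw k items so that every item has the same chance of selection."""
    return rng.sample(population, k)

def systematic_sample(population, k, rng):
    """Pick a random start, then take every step-th item."""
    step = len(population) // k
    start = rng.randrange(step)
    return [population[start + i * step] for i in range(k)]

def stratified_sample(population, key, fraction, rng):
    """Sample the same fraction from each stratum defined by key(item)."""
    strata = {}
    for item in population:
        strata.setdefault(key(item), []).append(item)
    sample = []
    for members in strata.values():
        size = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, size))
    return sample

rng = random.Random(0)         # fixed seed so the sketch is repeatable
population = list(range(100))  # hypothetical population of 100 unit IDs

srs = simple_random_sample(population, 10, rng)
systematic = systematic_sample(population, 10, rng)
# Hypothetical strata: IDs below 80 form "A", the rest form "B".
stratified = stratified_sample(
    population, lambda x: "A" if x < 80 else "B", 0.1, rng
)
```

Stratified sampling keeps the 80/20 split of the population in the sample, which a simple random sample of the same size only does on average.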
Data Science Statistics: Inferential Statistics
Course: 1 Hour, 2 Minutes
- Course Overview
- Gaussian Distribution
- Inferential Statistics and Hypothesis Testing
- Simplified Example of Hypothesis Testing
- T-tests
- Skewness and Kurtosis
- Correlation and Autocorrelation
- Introducing Linear Regression
- Overfitting and Goodness-of-Fit
- Exercise: Basic Inferential Statistics
Accessing Data with Spark: An Introduction to Spark
Course: 1 Hour, 7 Minutes
- Course Overview
- Introduction to Spark and Hadoop
- Resilient Distributed Datasets (RDDs)
- RDD Operations
- Spark Data Frames
- Spark Architecture
- Spark Installation
- Working with RDDs
- Creating Data Frames from RDDs
- Contents of a Data Frame
- The SQL Context
- The map() Function of an RDD
- Accessing the Contents of a Data Frame
- Data Frames in Spark and Pandas
- Exercise: Working with Spark
Getting Started with Hadoop: Fundamentals & MapReduce
Course: 1 Hour, 4 Minutes
- Course Overview
- An Introduction to Big Data
- Building Systems to Scale with Data
- A Quick Overview of Hadoop
- MapReduce Overview
- The Map Phase of a MapReduce
- The Shuffle and Reduce Phases
- Exercise: Fundamentals of Hadoop and MapReduce
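The map, shuffle, and reduce phases covered in this course can be imitated in plain Python with a word count. This is a single-process sketch for illustration only, not Hadoop code; the function names and input lines are hypothetical.

```python
from collections import defaultdict

def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle_phase(pairs):
    """Group all emitted values by key, as happens between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)

lines = ["big data big systems", "data systems scale"]  # hypothetical input

mapped = [pair for line in lines for pair in map_phase(line)]
grouped = shuffle_phase(mapped)
counts = dict(reduce_phase(k, v) for k, v in grouped.items())
print(counts)
```

In a real cluster the map calls run in parallel on separate input splits and the shuffle moves data between machines; the sketch only shows the flow of keys and values.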
Getting Started with Hadoop: Developing a Basic MapReduce Application
Course: 1 Hour, 14 Minutes
- Course Overview
- Provisioning a Hadoop Cluster on the Cloud
- Browsing the Hadoop Web Applications
- Creating a MapReduce Project
- Coding the Map Phase
- Coding the Reduce Phase
- Defining the Driver Program
- Building the Application
- Executing the MapReduce Application
- Exercise: Developing a Basic MapReduce Application
Hadoop HDFS: Introduction
Course: 1 Hour, 15 Minutes
- Course Overview
- Scaling Datasets
- Horizontal Scaling for Big Data
- Distributed Clusters and Horizontal Scaling
- Overview of HDFS
- HDFS Architectures
- MapReduce for HDFS
- YARN for HDFS
- The Mechanism of Resource Allocation in Hadoop
- Apache Zookeeper for HDFS
- The Hadoop Ecosystem
- Exercise: An Introduction to HDFS
Hadoop HDFS: Introduction to the Shell
Course: 53 Minutes
- Course Overview
- Creating a Hadoop Cluster on the Google Cloud
- Exploring Hadoop Clusters
- The YARN Cluster Manager UI
- The HDFS Name Node UI
- Browsing the Packaged Hadoop Tools
- Configuring HDFS
- The HDFS Shells
- Exercise: Introduction to the HDFS Shell
Hadoop HDFS: Working with Files
Course: 48 Minutes
- Course Overview
- Basic Directory Commands in HDFS
- Using the copyFromLocal Command in HDFS
- Using the put Command in HDFS
- Using the copyToLocal Command in HDFS
- Retrieving Files from HDFS
- Append and Delete Operations in HDFS
- Exercise: Working with Files on HDFS
Hadoop HDFS: File Permissions
Course: 49 Minutes
- Course Overview
- The HDFS count and du Commands
- Viewing and Setting File Permissions in HDFS
- Applying Permissions Recursively in HDFS
- An Introduction to Bash Scripting
- Scripting HDFS Operations
- Exploring the HDFS Name Node UI
- Cleanup Operations in HDFS
- Exercise: File Permissions on HDFS
Data Silos, Lakes, & Streams: Introduction
Course: 1 Hour, 20 Minutes
- Course Overview
- Data Silos
- Data Lakes
- Characteristics of Data Lakes
- Data Lake Architecture, Features, and Challenges
- Data Warehouses
- Data Warehouses vs. Data Lakes
- Data Streams
- Migrating Data to AWS
- Data Lakes on AWS
- Working with Data Lakes on AWS
- Exercise: Data Silos, Lakes, and Streams
Data Silos, Lakes, and Streams: Data Lakes on AWS
Course: 1 Hour, 10 Minutes
- Course Overview
- Create a Role for the AWS Glue Service
- Upload Data to S3
- Explore the Glue Web Console
- Manually Create Glue Tables
- Query the Data Lake Using Amazon Athena
- Configure and Run Glue Crawlers
- Access Data in Crawled Tables
- Crawl Multiple CSV Files in the Same Folder Path
- Merge Data in Multiple Files in the Same Folder Path
- Work with Files Having the Exact Same Schema
- Exercise: Data Lakes on AWS with S3 and Glue
Data Silos, Lakes, & Streams: Sources, Visualizations, & ETL Operations
Course: 1 Hour, 29 Minutes
- Course Overview
- Set Up a Redshift Cluster
- Create Tables and Load Data From S3
- Establish a JDBC Connection to Redshift
- Crawl Redshift Using a JDBC Connection
- Crawl DynamoDB
- Configure QuickSight to Visualize Data
- Visualize Data in QuickSight
- Configure a Job to Perform Extract, Transform, Load
- Execute an ETL Operation in Glue
- Perform ETL to Back Up Redshift Data in S3 Buckets
- Perform ETL to Back Up DynamoDB Data in S3 Buckets
- Exercise: Multiple Sources, Visualizations, and ETL
Data Analysis Application
Course: 1 Hour, 25 Minutes
- Course Overview
- Install and Configure Anaconda Python
- Install R Using Anaconda
- Use Jupyter Notebook
- Import and Export Data in Python
- Import and Export Data in R
- Deal with Missing Data in R
- Transform Data in R
- Work with Numpy
- Work with Pandas
- Mean, Median, and Mode in R
- Analyze Data with Pandas
- Plot Data in R
- Visualize Data in Python
- Exercise: Perform Data Analysis
Data Science Track 2: Data Wrangler
Data Wrangling with Pandas: Working with Series & DataFrames
Course: 1 Hour, 11 Minutes
- Course Overview
- Installing Pandas
- Pandas Series Objects
- Operations on Series
- Appending and Sorting Series Values
- Pandas DataFrames
- Indexing Operations with DataFrames
- Missing Data
- Column Aggregations
- Statistical Operations
- Exercise: Operations on Series and DataFrames
Data Wrangling with Pandas: Visualizations and Time-Series Data
Course: 1 Hour, 29 Minutes
- Course Overview
- Pandas and Matplotlib for Visualizations
- Pie Charts, Box Plots, and Scatter Plots
- Time-Series Data
- Deltas and Percentage Change Calculations
- Time Deltas and Date Ranges
- Mismatched DataFrames and Missing Data
- Working with String Data
- Advanced Operations on Strings
- Applying Functions on Series
- Transforming Data With User-Defined Functions
- Applying Functions on DataFrames
- Exercise: Plot Charts and Transform Column Values
Data Wrangling with Pandas: Advanced Features
Course: 1 Hour, 12 Minutes
- Course Overview
- Grouping and Aggregations
- MultiIndex DataFrames
- Grouping and Aggregations with MultiIndex DataFrames
- General Aggregation Functions
- Filtering
- Masking Column Values
- Working with Duplicates
- Working with Categorical Data
- Filtering, Adding, and Removing Categories
- Reindexing
- Exercise: Filtering, Duplicates and Categorical Data
Data Wrangler 4: Cleaning Data in R
Course: 1 Hour, 3 Minutes
- Course Overview
- Types of Unclean Data
- Data Quality
- Downloading JSON Data
- Excel Sheets
- Reading Dirty CSVs
- Querying Relational Databases
- Joining Tabular Data
- Spreading Data
- Summarizing Data
- Imputing Data
- Extracting Matches
- Exercise: Wrangling Data
Data Tools: Technology Landscape & Tools for Data Management
Course: 27 Minutes
- Course Overview
- Technology Landscape and Tools
- Tool Comparison
- Machine Learning in Data Analytics
- Machine Learning Tools
- Machine Learning Implementation
- Python and R for Data Management
- Cloud and Machine Learning
- Exercise: Implement Machine Learning on Scikit-learn
Data Tools: Machine Learning & Deep Learning in the Cloud
Course: 23 Minutes
- Course Overview
- Microsoft Machine Learning Toolkit
- AWS and Machine Learning
- Spark Machine Learning Capabilities
- Deep Learning Frameworks
- Deep Learning Implementation
- Data Mining and Analytical Tools
- KNIME Capabilities
- Exercise: Implement Deep Learning
Trifacta for Data Wrangling: Wrangling Data
Course: 50 Minutes
- Course Overview
- Standardizing Data
- Formatting Dates
- Filtering Rows
- Replacing Values
- Counting Matches
- Splitting Columns
- Merging Columns
- Extracting Data
- Conditional Aggregation
- Reshaping Data
- Joining Data
- Exercise: Wrangling Data
MongoDB for Data Wrangling: Querying
Course: 1 Hour, 8 Minutes
- Course Overview
- Introduction to PyMongo
- Document Structure
- CRUD Operations
- ObjectID and Timestamp
- Query Operations
- Projection Queries
- Comparison Operators
- Element Query Operators
- The Regex Operator
- Using the Size and All Operators
- Text Search
- Using mongoimport
- Using mongoexport
- Exercise: Performing a Query
MongoDB for Data Wrangling: Aggregation
Course: 51 Minutes
- Course Overview
- Aggregation Framework
- Using Group
- Using Match
- Using Project
- Using Limit and Sort
- Using Unwind
- Using Lookup
- Using Indexes
- Using Geospatial Indexes
- Exercise: Performing an Aggregate Query
Getting Started with Hive: Introduction
Course: 56 Minutes
- Course Overview
- Hive as a Data Warehouse
- Overview of Relational Databases
- OLTP and OLAP
- Hive and the Hadoop Ecosystem
- Hive Server and The Metastore
- Hive on Cloud Computing Platforms
- Data Types in Hive
- Data and Tables in Hive
- Exercise: Introduction to Hive
Getting Started with Hive: Loading and Querying Data
Course: 1 Hour, 20 Minutes
- Course Overview
- Setting up a Hadoop Cluster on the Google Cloud
- Creating a Hive Table
- Running Simple Queries in Hive
- Executing Hive Queries from the Shell
- Joining Tables in Hive
- Exploring the Hive Warehouse
- External Tables in Hive
- Modifying Tables in Hive
- Temporary Tables in Hive
- Loading Data into Tables in Hive
- Populating Multiple Tables in Hive
- Exercise: Loading and Querying Data in Hive
Getting Started with Hive: Viewing and Querying Complex Data
Course: 1 Hour, 14 Minutes
- Course Overview
- The Array Data Type in Hive
- The Map Data Type in Hive
- The Struct Type in Hive
- The explode and posexplode Functions in Hive
- Lateral Views in Hive
- Multiple Lateral Views in Hive
- Set Operations in Hive
- The IN and EXISTS Clauses in Hive
- Creating and Populating Tables in Hive
- Views in Hive
- Exercise: Viewing and Querying Complex Data
Getting Started with Hive: Optimizing Query Executions
Course: 43 Minutes
- Course Overview
- Hive Queries as MapReduce Jobs
- Techniques to Improve Query Performance in Hive
- Partitioning Tables in Hive
- Bucketing Tables in Hive
- Structuring Join Queries in Hive
- Exercise: Optimizing Query Execution in Hive
Getting Started with Hive: Optimizing Query Executions with Partitioning
Course: 1 Hour, 1 Minute
- Course Overview
- Setting up a Hadoop Cluster on the Google Cloud
- Creating a Partitioned Table in Hive
- Working with Partitions in Hive
- Populating Partitions in Hive
- Partitioning External Tables in Hive
- Modifying Partitions in Hive
- Dynamic Partitions in Hive
- Using Multiple Columns for Partitioning in Hive
- Exercise: Optimize Executions with Partitioning
Getting Started with Hive: Bucketing & Window Functions
Course: 1 Hour, 4 Minutes
- Course Overview
- Apply Bucketing for a Table in Hive
- Using Bucketing and Partitioning Together in Hive
- Sorting a Bucket's Contents in Hive
- Sampling a Table in Hive
- Joining Multiple Tables in Hive
- Introducing Window Functions in Hive
- Window Functions with Partitions in Hive
- Exercise: Bucketing and Window Functions in Hive
Getting Started with Hadoop: Filtering Data Using MapReduce
Course: 59 Minutes
- Course Overview
- Counting the Data Points in Each Category
- The Reducer and Driver Programs
- Building and Executing the Application
- A Simple Filter Using MapReduce
- Executing and Examining the Output
- Extracting the Unique Values in a Column
- Viewing the Distinct Values Extracted
- Exercise: Filtering Data Using MapReduce
Getting Started with Hadoop: MapReduce Applications With Combiners
Course: 1 Hour, 24 Minutes
- Course Overview
- Combiners in MapReduce
- Revisiting MapReduce
- Working with Combiners
- Using Combiners for Calculating Averages
- Creating a Project to Calculate Averages
- Coding the Map and Reduce Phases
- Configure the Application in the Driver
- Executing the Application and Examining the Output
- Adding a Combiner to a MapReduce Application
- Conveying a Pair of Numbers from the Mapper
- Running the Fixed Application
- Exercise: Optimizing MapReduce With Combiners
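The topic "Conveying a Pair of Numbers from the Mapper" refers to a common pitfall when a combiner is added to an averaging job: averaging partial averages gives the wrong answer, so each stage passes a (sum, count) pair instead. The plain-Python sketch below imitates that idea under made-up inputs; it is not Hadoop code and the names are hypothetical.

```python
def to_pair(value):
    """Mapper output: a partial sum and a count of one."""
    return (value, 1)

def merge_pairs(pairs):
    """Combiner and reducer logic: add sums and counts component-wise."""
    total = sum(s for s, _ in pairs)
    count = sum(c for _, c in pairs)
    return (total, count)

# Hypothetical values for a single key, spread over two mapper splits.
split_1 = [10, 20]
split_2 = [60]

# Each split is combined locally before anything is sent to the reducer.
combined_1 = merge_pairs([to_pair(v) for v in split_1])
combined_2 = merge_pairs([to_pair(v) for v in split_2])

# The reducer merges the combined pairs and divides only once, at the end.
total, count = merge_pairs([combined_1, combined_2])
average = total / count

# Averaging the per-split averages instead gives a different result.
naive = (sum(split_1) / len(split_1) + sum(split_2) / len(split_2)) / 2
print(average, naive)
```

Because adding sums and counts is associative, the same merge step can run any number of times as a combiner without changing the final average.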
Getting Started with Hadoop: Advanced Operations Using MapReduce
Course: 49 Minutes
- Course Overview
- Defining a User-Defined Type for a PriorityQueue
- Implementing a PriorityQueue in a Mapper
- Using a PriorityQueue in a Reducer
- Running and Verifying the Results
- Building an Inverted Index - Map Phase
- Building an Inverted Index - Reduce Phase
- Executing the Application and Viewing the Index
- Exercise: Advanced Operations Using MapReduce
Accessing Data with Spark: Data Analysis Using the Spark DataFrame API
Course: 1 Hour, 12 Minutes
- Course Overview
- Performance Improvements in Spark
- Broadcast Variables and Accumulators
- Loading Data into a DataFrame
- Sampling the Contents of a DataFrame
- Grouping and Aggregations
- Visualizing Data in a DataFrame
- Trimming and Cleaning Data
- User-Defined Functions and DataFrames
- Combining Filters, Aggregations, and Sorting
- Using Broadcast Variables
- Using Accumulators
- Exporting DataFrame Contents
- Custom Accumulators
- Join Operations
- Exercise: Data Analysis Using the DataFrame API
Accessing Data with Spark: Data Analysis using Spark SQL
Course: 55 Minutes
- Course Overview
- The Spark Catalyst Optimizer
- Introduction to Spark SQL
- Preparing Data for Analysis
- Running SQL Queries
- Inferred and Explicit Schemas
- Windowing in Spark
- Applying Window Functions
- Exercise: Data Analysis Using Spark SQL
Data Lake: Framework & Design Implementation
Course: 34 Minutes
- Course Overview
- Data Lakes and Data Warehouses
- Data Lake Selection Criteria
- Data Lake and Data Democratization
- Data Lake Design Principles
- AWS Data Lake Architecture
- Implement AWS Data Store
- Data Lake For On-Premise and Multi-Cloud
- Data Processing Frameworks for Data Lake
- Exercise: Implement AWS Data Store
Data Lake: Architectures & Data Management Principles
Course: 35 Minutes
- Course Overview
- Real-Time Big Data Architectures
- Data Lake Reference Architecture
- Data Ingestion and File Formats
- Ingestion Using Sqoop
- Data Processing Strategies
- Deriving Value from Data Lakes
- Data Life Cycle
- S3 and Glacier
- Exercise: Ingest Data and Implement Archival Policy
Data Architecture - Deep Dive: Design & Implementation
Course: 36 Minutes
- Course Overview
- Data Complexity Management Strategies
- Data Modeling Process
- Distributed Data Management
- Partitioning Methods and Criteria
- MongoDB Partitioning
- Hybrid Data Architectures
- Implement Directed Acyclic Graph
- CAP Theorem
- Batch vs. Streaming
- Read and Write Concerns
- Exercise: Implement Serverless Architecture
Data Architecture - Deep Dive: Microservices & Serverless Computing
Course: 26 Minutes
- Course Overview
- Microservices and Data
- Serverless and Lambda Architecture
- Lambda Implementation
- Cluster Benefits
- Data Architecture Types
- Data Discovery Process
- Data Risk Types
- Data POC
- Exercise: Implement Lambda Architecture
Data Science Track 3: Data Ops
Deploying Data Tools: Data Science Tools
Course: 48 Minutes
- Course Overview
- Data Science Platform
- Challenges of Deploying Data Science Tools
- Considerations for Data Science Tools
- Data Science Workflow
- Data Science Analytic Tools
- Data Science Visualization Tools
- Data Science Database Tools
- Benefits of Deploying Cloud-Based Tools
- Challenges of Deploying Cloud-Based Tools
- What is DevOps
- DevOps for Data Science
- Exercise: Identifying Uses of Data Science Tools
Delivering Dashboards: Management Patterns
Course: 34 Minutes
- Course Overview
- Analytical Visualization
- Dashboard Types
- Data Management
- Dashboard Components
- Dashboard Best Practices
- Dashboard Using ELK
- Dashboard Using Power BI
- Chart Selection Criteria
- Leaderboards and Scorecards
- Scorecard Types
- Exercise: Create Dashboards with Power BI and ELK
Delivering Dashboards: Exploration & Analytics
Course: 31 Minutes
- Course Overview
- Data Exploration Using Charts
- Analytical Visualization Tools
- Bar and Line Charts
- Dashboarding with Kibana
- Dashboard Sharing with Kibana
- Dashboarding with Tableau
- Dashboarding with QlikView
- Data Ingest and Dashboards
- Dashboard Patterns
- Monitoring Dashboards
- Exercise: Create Dashboards Using Kibana and Tableau
Cloud Data Architecture: DevOps & Containerization
Course: 45 Minutes
- Course Overview
- Containerization on the Cloud
- Benefits of Containers
- Serverless Computing
- DevOps in the Cloud
- AWS OpsWorks
- Storage Classification
- Cloud and Machine Learning
- Cloud and BI Analytics
- Exercise: Containerization and Serverless Computing
Compliance Issues and Strategies: Data Compliance
Course: 44 Minutes
- Course Overview
- Data Compliance Issues
- Data Regulations
- The Importance of Global Standards
- Risk and Company Standards
- Myths and Facts of Data Compliance
- Compliance Training for Users
- Compliance Training for Management
- The Benefits of a Data Compliance Program
- Elements of a Good Compliance Strategy
- Building a Compliance Strategy
- Reporting and Response Procedures
- Exercise: Explain the Importance of Data Compliance
Implementing Governance Strategies
Course: 46 Minutes
- Course Overview
- Governance and its Relationship with Big Data
- Why Big Data Requires Governance
- Requirements for Big Data Governance
- Why is Big Data Different?
- Identifying Data
- Identifying Stakeholders
- Cloud Technologies and Data Governance
- Designing a Data Governance Process
- Managing a Data Governance Strategy
- Monitoring a Data Governance Strategy
- Maintaining a Data Governance Strategy
- Exercise: Defining Data Governance Strategies
Data Access & Governance Policies: Data Access Oversight and IAM
Course: 59 Minutes
- Course Overview
- Data Access Governance
- Risk and Data Safety Compliance
- Data Access Patterns
- Data Breach Prevention
- Least Privilege
- Assign and View Effective File System Permissions
- Identity and Access Management
- Create an AWS IAM User and Group
- Assign AWS IAM Group Permissions
- Vulnerability Assessments
- Implement Effective Security Controls
- Exercise: Implement Data Access Governance Solutions
Data Access & Governance Policies: Data Classification, Encryption, and Monitoring
Course: 1 Hour, 19 Minutes
- Course Overview
- Data Classification
- Classify Data Using Microsoft FSRM
- Data Encryption
- Encrypt Data at Rest
- Encrypt Data in Motion
- Implement Security Compliance Checking
- Examine Data Access Trends
- Data Access Monitoring Solutions
- Logging, Auditing, and Data Analytics
- Configure a Custom Filtered Log View
- Enable Windows Data Access Auditing
- Exercise: Implement Data Confidentiality
Streaming Data Architectures: An Introduction to Streaming Data
Course: 51 Minutes
- Course Overview
- Introduction to Streaming Data
- The Stream Processing Model
- The Message Transport
- Stream Processing with RDDs
- Structured Streaming for Continuous Applications
- Streaming vs Structured Streaming
- Triggers and Output Modes
- Exercise: Working with Streaming Data
Streaming Data Architectures: Processing Streaming Data
Course: 53 Minutes
- Course Overview
- PySpark Setup
- Setting Up a Socket Stream with Netcat
- The Update Output Mode
- Using a File Input Stream
- The Append Output Mode
- The Complete Output Mode
- Aggregations on Streaming Data
- SQL Operations on Streaming Data
- User-Defined Functions (UDFs)
- Exercise: Processing Streaming Data
Scalable Data Architectures: Introduction
Course: 53 Minutes
- Course Overview
- Scalable Architectures with Distributed Computing
- Introducing Data Warehouses
- Contrasting Warehouses with Relational Databases
- Data Warehouses for Analytical Processing
- Data Warehouse Architectural Components
- Amazon Redshift - A Data Warehouse on the Cloud
- Exercise: Scalable Data Architectures
Scalable Data Architectures: Introduction to Amazon Redshift
Course: 55 Minutes
- Course Overview
- Provisioning a Redshift Cluster Using Quick Launch
- Creating a Redshift Cluster With Additional Detail
- Exploring the Redshift Configs and Metrics
- Attaching an IAM Role to a Redshift Cluster
- Creating an AWS User to Work With Redshift
- Installing and Configuring the AWS CLI
- Running Queries from the Redshift Query Editor
- Exercise: An Introduction to Amazon Redshift
Scalable Data Architectures: Working with Amazon Redshift & QuickSight
Course: 1 Hour, 18 Minutes
- Course Overview
- Loading Data from Amazon S3 to a Redshift Cluster
- Running Queries and Evaluating Their Execution
- Querying a Redshift Cluster Using a SQL Client
- Working with Automated Snapshots
- Restoring Tables from a Snapshot
- Horizontal Scaling of a Redshift Cluster
- Vertical and Horizontal Scaling of a Cluster
- Configuring Access from QuickSight to Redshift
- Loading a Dataset to QuickSight
- Creating Visualizations with QuickSight
- Exercise: Working with Redshift and QuickSight
Building Data Pipelines
Course: 1 Hour, 10 Minutes
- Course Overview
- Data Pipelines Overview
- Traditional ETL Pipeline with Batch Processing
- Data Pipeline Tools
- Setup and Install Airflow
- Apache Airflow
- Airflow Workflows
- Airflow Tasks
- Airflow Dependencies
- ETL Pipeline with Airflow
- Automated Pipeline without ETL
- Airflow Command Line Testing
- Exercise: Using Apache Airflow
Data Pipeline: Process Implementation Using Tableau & AWS
Course: 39 Minutes
- Course Overview
- Data Pipeline
- Data Pipeline Processes
- Data Pipeline Stages
- Data Pipeline Technologies
- Data Source Types
- Scheduled Data Pipeline
- Tableau Server and Utilities
- Data Pipeline Using Tableau
- Data Pipeline on AWS
- Exercise: Build Data Pipelines with Tableau
Data Pipeline: Using Frameworks for Advanced Data Management
Course: 33 Minutes
- Course Overview
- Celery and Luigi
- Data Pipeline with Python Luigi
- Working with Dask Library
- Dask Arrays
- Data Exploration and Visualization Frameworks
- Spark and Tableau
- Streaming Data Visualization with Python
- Data Pipeline Open Source Tools
- Exercise: Implement Data Pipelines with Luigi
Data Sources: Integration
Course: 40 Minutes
- Course Overview
- Elements of IoT Solutions
- Service Categories in IoT
- IoT Capabilities and Maturity Model
- IoT Design Principles
- IoT Cloud Architectures
- MQTT and XMPP
- IoT Controllers
- IoT Data Management
- Securing IoT
- Exercise: Generating Data Streams
Data Sources: Implementing Edge on the Cloud
Course: 31 Minutes
- Course Overview
- AWS IoT Greengrass
- GCP IoT Edge
- AWS IoT over WebSockets
- IoT Device Simulator
- Generating Streams of Data Using MQTT
- Exercise: Working with IoT Device Simulators
Securing Big Data Streams
Course: 1 Hour, 3 Minutes
- Course Overview
- Big Data Security Concerns
- Streaming Data Security Concerns
- NoSQL Database Security Concerns
- Distributed Processing Security Risks
- Data Mining and Analytics Privacy Flaws
- End-Point Device Tampering Risks
- Secure Big Data
- Secure Data Streams
- Secure Data In Motion
- End-Point Input Validation and Filtering
- Secure Data at Rest with Symmetric Ciphers
- Exercise: Securing Big Data Streams
Harnessing Data Volume & Velocity: Big Data to Smart Data
Course: 39 Minutes
- Course Overview
- Comparing Big Data and Smart Data
- Smart Data and Edge Technologies
- Big Data to Smart Data Formation
- Smart Data and Smart Processes
- Smart Data Use Cases
- Smart Data Life Cycle
- Big Data to Smart Data Using k-NN
- Smart Data Frameworks
- Smart Data to Business
- Clustering Smart Data
- Smart Data Integration
- Exercise: Transform Big Data to Smart Data
Data Rollbacks: Transaction Rollbacks & Their Impact
Course: 36 Minutes
- Course Overview
- Rollback Process
- State of Transactions
- Transaction Types
- SQL Transaction Management
- Transaction Log Operations
- Deadlock Management
- SQL Server Rollback Mechanism
- SQL Server Rollback Mechanism Implementation
- Exercise: Implement Transactions with SQL Server
Data Rollbacks: Transaction Management & Rollbacks in NoSQL
Course: 29 Minutes
- Course Overview
- NoSQL and SQL Transaction Management
- MongoDB Transactions
- Manage Multi-Document Transactions in MongoDB
- Change Data Capture
- Change Stream in MongoDB
- MongoDB Change Stream Implementation
- Exercise: MongoDB Transactions and Change Streams
Data Science Track 4: Data Scientist
Balancing the Four Vs of Data: The Four Vs of Data
Course: 40 Minutes
- Course Overview
- Overview of the Four Vs
- The Importance of Volume
- The Importance of Variety
- The Importance of Velocity
- The Importance of Veracity
- The Relationship Between the Four Vs
- Variety and Data Structure
- Validity and Volatility
- Finding Balance in the Four Vs
- Use Cases
- Extracting Value from the Four Vs
- Exercise: Describe the Four Vs of Big Data
Data Driven Organizations
Course: 1 Hour, 15 Minutes
- Course Overview
- Data Driven Organizations
- Decision Making
- Analytic Maturity
- Analytic Roles
- Data Source Priority
- Facets of Data Quality
- Power BI Data Visualization
- Missing Data
- Duplicate Data
- Truncated Data
- Data Provenance
- Exercise: Use Informatica Data Quality
Raw Data to Insights: Data Ingestion & Statistical Analysis
Course: 54 Minutes
- Course Overview
- Statistical Analysis
- Data Correction
- Outlier Detection
- Data Architecture Pattern
- Data Ingestion Tools
- Kafka and Apache NiFi
- Apache Sqoop Ingest
- Ingest Using WaveFront
- Exercise: Detecting Outliers and Ingesting Data
Raw Data to Insights: Data Management & Decision Making
Course: 57 Minutes
- Course Overview
- Data-driven Decision Making Framework
- Loading Data into R
- Preparing Data
- Data Correction Approach
- Data Correction Using Simple Transformation
- Data Correction Using Deductive Correction
- Distributed Data Management
- Data Analytics
- Data Analytics Using R
- Predictive Modeling
- Exercise: Correcting and Modelling Data
Tableau Desktop: Real Time Dashboards
Course: 1 Hour, 8 Minutes
- Course Overview
- Introducing Real Time Dashboards
- Creating Real Time Dashboards with Tableau
- Build a Tableau Dashboard
- Real Time Dashboard Updates in Tableau
- Organizing Your Tableau Dashboard
- Formatting Your Tableau Dashboard
- Interactive Tableau Dashboard
- Tableau Dashboard Starters
- Tableau Dashboard Extensions
- Tableau Dashboards and Story Points
- Sharing your Tableau Dashboard
- Exercise: Creating a Tableau Dashboard Starter
Storytelling with Data: Introduction
Course: 47 Minutes
- Course Overview
- Storytelling Process
- Interpreting Context
- Analysis Types
- Who, What, and How of Storytelling
- Visualization for Storytelling
- Graphical Tools for Data Elaboration
- Storytelling Scenarios
- Storyboarding
- Exercise: Visualization and Graphical Tool
Storytelling with Data: Tableau & PowerBI
Course: 57 Minutes
- Course Overview
- Visual Selection
- Slopegraphs
- Bar Charts and Types of Bar Charts
- Clutter and Clutter Elimination
- Gestalt Principle
- Story Design Best Practices
- Tools for Storytelling
- Decluttering
- Crafting Visual Data
- Visual Design Concerns
- Storytelling with Power BI
- Model Visual and Tableau
- Exercise: Storytelling with Power BI
Python for Data Science: Basic Data Visualization Using Seaborn
Course: 1 Hour, 7 Minutes
- Course Overview
- Introduction to Seaborn
- Install Seaborn
- Simple Univariate Distributions
- Configure Univariate Distribution Plots
- Simple Bivariate Distributions
- Explore Different Types of Bivariate Distributions
- Analyze Multiple Variable Pairs
- Regression Plots
- Themes and Styles in Seaborn
- Exercise: Basic Data Visualization Using Seaborn
Python for Data Science: Advanced Data Visualization Using Seaborn
Course: 1 Hour, 4 Minutes
- Course Overview
- Searching for Patterns in a Dataset
- Configuring Plot Aesthetics
- Normal Distribution and Outliers
- Distributions Within Categories - Part 1
- Distributions Within Categories - Part 2
- Analyzing Categories with Facet Grids - Part 1
- Analyzing Categories with Facet Grids - Part 2
- Introducing Color Palettes
- Using Color Palettes
- Exercise: Advanced Data Visualization Using Seaborn
Data Science Statistics: Using Python to Compute & Visualize Statistics
Course: 1 Hour, 16 Minutes
- Course Overview
- An Introduction to Matplotlib
- Analyzing Data Using NumPy and Pandas
- Visualizing Univariate and Bivariate Distributions
- Summary Statistics Using Native Python Functions
- Summary Statistics Using NumPy
- Summary Statistics Using the SciPy Library
- Correlation and Covariance
- Z-score
- Exercise: Compute and Visualize Statistics
R for Data Science: Data Visualization
Course: 33 Minutes
- Course Overview
- Using Scatter Plots
- Using Line Graphs
- Using Bar Charts
- Using Box and Whisker Plots
- Using Histograms
- Using a Bubble Plot
- Exercise: Data Visualization
Advanced Visualizations & Dashboards: Visualization Using Python
Course: 38 Minutes
- Course Overview
- Relevance of Data Visualization for Business
- Libraries for Data Visualization in Python
- Python Data Visualization Environment Configuration
- Matplotlib Libraries for Visualization
- Bar Chart Using ggplot
- Bokeh and Pygal
- Select Visualization Libraries
- Interactive Graphs and Image Files
- Plot Graphs
- Multiple Lines in Graphs
- Exercise: Create Line Charts with Pygal
Advanced Visualizations & Dashboards: Visualization Using R
Course: 35 Minutes
- Course Overview
- Chart Types
- Stacked Bar Plot
- Animate Plots with Matplotlib
- Plotting in Jupyter Notebook
- Graphics in R
- Heat Map and Scatter Plot in R
- Correlogram and Area Chart in R
- ggplot2 Capabilities
- Customize ggplot2 Graphs
- Exercise: Creating Heat Maps and Scatter Plots
Powering Recommendation Engines: Recommendation Engines
Course: 1 Hour, 5 Minutes
- Course Overview
- Describing Recommendation Engines
- Comparing the Types of Recommendation Engines
- Collecting and Manipulating Data
- Manipulating Data in R
- Describing Similarity and Neighborhoods
- Creating a Recommendation Engine
- Recommending Another Item
- Finding Items to Recommend
- Recommending Items Based on Other Items
- Evaluating a Recommendation System
- Validating a Recommendation System
- Exercise: Creating a Recommendation Engine
Data Insights, Anomalies, & Verification: Handling Anomalies
Course: 46 Minutes
- Course Overview
- Data and Anomaly Sources
- Decomposition and Forecasting
- Examine Data Using Randomization Tests
- Anomaly Detection
- Anomaly Detection Techniques
- Anomaly Detection with scikit-learn
- Anomaly Detection Tools
- Anomaly Detection Rules
- Exercise: Detecting Anomalies
Data Insights, Anomalies, & Verification: Machine Learning & Visualization Tools
Course: 51 Minutes
- Course Overview
- Machine Learning Anomaly Detection Techniques
- Comparing Anomaly Detection Algorithms
- Anomaly Detection Using R
- Online Anomaly Detection Components
- Online Anomaly Detection Approaches
- Anomaly Detection Use Cases
- Anomaly Detection with Visualization Tools
- Anomaly Detection with Mathematical Approaches
- Cluster-Based Anomaly Detection
- Exercise: Detecting Anomalies
Data Science Statistics: Applied Inferential Statistics
Course: 1 Hour, 19 Minutes
- Course Overview
- The One-Sample T-test
- Independent and Paired T-tests
- Testing Hypotheses with T-tests
- Loading and Analyzing a Skewed Dataset
- Measuring Skewness and Kurtosis
- Preparing a Dataset for Regression
- Simple Linear Regression
- Multiple Linear Regression
- Exercise: Applied Inferential Statistics
Data Research Techniques
Course: 33 Minutes
- Course Overview
- Data Research Fundamentals
- Data Research Steps
- Values, Variables, and Observations
- JMP Scale of Measurement
- Non-experimental and Experimental Research
- Descriptive and Inferential Statistical Analysis
- Inferential Tests
- Case Study of Clinical Data Research
- Data Research in Sales Management
- Exercise: Implement Data Research
Data Research Exploration Techniques
Course: 50 Minutes
- Course Overview
- Fundamentals of Exploratory Data Analysis
- Data Exploration Types
- Working with R
- Data Exploration in R
- Data Exploration Using Plots
- Python Packages for Data Exploration
- Data Exploration Using Python
- Data Research Using Linear Algebra
- Linear Algebra for Data Research
- Exercise: R and Python for Data Exploration
Data Research Statistical Approaches
Course: 43 Minutes
- Course Overview
- Role of Statistics in Data Research
- Discrete vs. Continuous Distribution
- PDF and CDF
- Binomial Distribution
- Interval Estimation
- Point and Interval Estimation
- Data Visualization Techniques
- Data Visualization Using R
- Data Integration Techniques
- Creating Plots
- Missing Values and Outliers
- Exercise: Statistical Methods for Data Research
Machine & Deep Learning Algorithms: Introduction
Course: 46 Minutes
- Course Overview
- Machine Learning Algorithms
- How Machine Learning Works
- Introduction to Pandas ML
- Support Vector Machines
- Overfitting
- Exercise: Machine Learning and Classification
Machine & Deep Learning Algorithms: Regression & Clustering
Course: 49 Minutes
- Course Overview
- The Confusion Matrix
- An Introduction to Regression
- Applications of Regression
- Supervised and Unsupervised Learning
- Clustering
- Principal Component Analysis
- Exercise: Regression and Clustering
Machine & Deep Learning Algorithms: Data Preparation in Pandas ML
Course: 1 Hour, 4 Minutes
- Course Overview
- Data Preparation in scikit-learn
- Training and Evaluating Models in scikit-learn
- Introducing the Pandas ML ModelFrame
- Training and Evaluating Models in Pandas ML
- Preparing Data for Regression
- Evaluating Regression Models
- Preparing Data for Clustering
- The K-Means Clustering Algorithm
- Exercise: Regression, Classification, and Clustering
Machine & Deep Learning Algorithms: Imbalanced Datasets Using Pandas ML
Course: 1 Hour, 24 Minutes
- Course Overview
- Analyzing an Imbalanced Dataset
- The RandomOverSampler
- The SMOTE Oversampler
- Undersampling Using imbalanced-learn
- Ensemble Classifiers for Imbalanced Data
- Combination Samplers
- Finding Correlations in a Dataset
- Building a Multi-Label Classification Model
- Dimensionality Reduction with PCA
- Imbalanced Learn and PCA
Creating Data APIs Using Node.js
Course: 1 Hour, 31 Minutes
- Course Overview
- API Prerequisites
- Building a RESTful API Using Node.js and Express.js
- RESTful API with OAuth
- HTTP Server with Hapi.js
- API Modules
- Returning Data with JSON
- Nodemon for Development Workflow
- API Requests
- Postman for API
- Deploying APIs
- Social Media APIs
- Exercise: Building RESTful APIs
Course duration | 91:36 hours |
---|---|
Language | English |
Online access | 365 days |
Certificate of participation | Yes |
Award-winning online training | Yes |
Reviews
No reviews have been submitted for this product yet.