This course runs for 5 days.
The class will run daily from 10:00 AM ET to 6:00 PM ET.
Class Location: Knowledge Transfer - Eagan, MN.
To succeed as a data engineer in today’s fast-paced workplace, you need a strong understanding of programming, automation and scripting, data stores, data processing techniques, and more. In this comprehensive 5-day Data Engineering bootcamp, you will gain foundational data engineering skills, such as disseminating and repairing data with Python and Spark SQL, crafting visualizations of the results, and building production-ready distributed data infrastructure.
The course combines theoretical learning modules with hands-on labs, so you gain essential skills you can put into action in the workplace right away. With demand for data engineers growing in all industries, this highly rated bootcamp is a strong way to enhance your skill set.
The course is weighted heavily toward data engineering and data manipulation, with some coverage of data science.
Audience
This Data Engineer Bootcamp is intended for data engineers.
Lab 1. Data Availability and Consistency
Lab 2. A/B Testing Data Engineering Tasks Project
Lab 3. Learning the Databricks Community Cloud Lab Environment
Lab 4. Python Variables
Lab 5. Dates and Times
Lab 6. The if, for, and try Constructs
Lab 7. Understanding Lists
Lab 8. Dictionaries
Lab 9. Sets
Lab 10. Tuples
Lab 11. Functions
Lab 12. Functional Programming
Lab 13. File I/O
Lab 14. Using HTTP and JSON
Lab 15. Random Numbers
Lab 16. Regular Expressions
Lab 17. Understanding NumPy
Lab 18. A NumPy Project
Lab 19. Understanding pandas
Lab 20. Data Grouping and Aggregation
Lab 21. Repairing and Normalizing Data
Lab 22. Data Visualization and EDA with pandas and seaborn
Lab 23. Correlating Cause and Effect
Lab 24. Learning PySpark Shell Environment
Lab 25. Understanding Spark DataFrames
Lab 26. Learning the PySpark DataFrame API
Lab 27. Data Repair and Normalization in PySpark
Lab 28. Working with Parquet File Format in PySpark and pandas
Participants should have some working experience in any programming language (students will be introduced to programming in Python) and a basic understanding of SQL and data processing concepts, including data grouping and aggregation.