Certificate Associate in Data Science – Programming Basics
Computers are very powerful tools, but unfortunately, they can’t think for themselves. So they need to be told everything. They need to be told how to perform a task, how to evaluate a condition to decide which path to follow, how to handle data that comes from a device such as the network or a disk, and how to react when something unforeseen happens, say, something is broken or missing.
Good code is short, fast, elegant, easy to read and understand, simple, easy to modify and extend, easy to scale and refactor, and easy to test. It takes time to be able to write code that has all these qualities at the same time, but the good news is that you’re taking the first step towards it at this very moment by exploring this course.
Python
Python has emerged over the last couple of decades as a first-class tool for scientific computing tasks, including the analysis and visualization of large datasets. This may have come as a surprise to early proponents of the Python language: the language itself was not specifically designed with data analysis or scientific computing in mind. The usefulness of Python for data science stems primarily from the large and active ecosystem of third-party packages: NumPy for manipulation of homogeneous array-based data, Pandas for manipulation of heterogeneous and labeled data, SciPy for common scientific computing tasks, Matplotlib for publication-quality visualizations, IPython for interactive execution and sharing of code, Scikit-Learn for machine learning, and many more tools that are covered in the course material.
Some features of the core Python language are more important for data analysis than others. In this course, you’ll look at the most essential of them: string functions, data structures, list comprehension, counters, file and web functions, regular expressions, globbing, and data pickling. You’ll learn how to use Python to extract data from local disk files and the Internet, store them into appropriate data structures, locate bits and pieces matching certain patterns, and serialize and de-serialize Python objects for future processing. However, these functions are by no means specific to data science or data analysis tasks and are found in many other applications.
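As a rough illustration of how a few of these features fit together, the sketch below uses only the standard library. The sample text and variable names are hypothetical and stand in for data that would normally be read from a file or the web; file access and globbing are left out to keep the example self-contained.

```python
import pickle
import re
from collections import Counter

# Hypothetical sample text standing in for data read from a file or the web.
text = "Order 17 shipped; order 42 pending; order 17 returned."

# String functions + list comprehension: split into words,
# strip trailing punctuation, and lower-case each one.
words = [w.strip(".;").lower() for w in text.split()]

# Counter: tally how often each word occurs.
word_counts = Counter(words)

# Regular expression: extract every run of digits as an integer.
order_ids = [int(m) for m in re.findall(r"\d+", text)]

# Pickling: serialize the results to bytes and restore them later.
payload = pickle.dumps({"counts": word_counts, "ids": order_ids})
restored = pickle.loads(payload)

print(word_counts.most_common(2))
print(order_ids)
```

Here `pickle.dumps` and `pickle.loads` work on in-memory bytes; writing the same payload to disk would use `pickle.dump` with a file opened in binary mode.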
Benefits of learning Python
- Python is an extremely simple language to read and write, even if you’ve never coded before
- It is one of the most common languages, both in production and in the academic setting (one of the fastest growing, as a matter of fact)
- The language’s online community is vast and friendly. This means that a quick web search should turn up answers from people who have faced and solved similar (if not exactly the same) problems.
- Python has prebuilt data science modules that both the novice and the veteran data scientist can utilize
Learning Objectives
Upon completion of the course, participants should be able to:
- Explain what “interpreted” means in the context of Python programming
- Write Python code to perform arithmetic operations
- Import and use libraries, modules and packages related to data science
- Create and access data structures to store relevant data
- Perform iterations on containers to change container values
- Apply conditionals for decision making
- Explain the concept of Big O notation
- Apply functional programming concepts such as map, filter, reduce and lambda on data sets
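To give a flavor of the last objective, here is a minimal sketch of `map`, `filter`, `reduce`, and `lambda` applied to a small list. The temperature values are made-up sample data, not course material.

```python
from functools import reduce

# Hypothetical sample data: daily temperatures in degrees Celsius.
celsius = [18.5, 21.0, 25.5, 30.0]

# map: apply a conversion to every element.
fahrenheit = list(map(lambda c: c * 9 / 5 + 32, celsius))

# filter: keep only the elements that satisfy a condition.
warm_days = list(filter(lambda c: c > 20, celsius))

# reduce: fold the sequence into a single value (here, the sum).
total = reduce(lambda acc, c: acc + c, celsius, 0.0)
mean = total / len(celsius)

print(fahrenheit)
print(warm_days)
print(mean)
```

In everyday Python, a list comprehension or the built-in `sum` would often be preferred for these particular operations; the functional forms are shown because the objective names them.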
Who should attend
Data Analysts, Data Engineers, Data Science Enthusiasts, Business Analysts, Project Managers
Prerequisite
Foundational certificate in Big Data/ Data Science
This course is meant for anyone who is comfortable developing applications in Python and now wants to enter the world of data science or build intelligent applications. Aspiring data scientists with some understanding of the Python programming language will also find this course very helpful. If you want to build efficient data science applications and bring them into the enterprise environment without changing your existing Python stack, this course is for you.
Delivery Method
A mix of instructor-led, case-study-driven, and hands-on sessions for select phases
Hardware and Software Requirements
Python, Pandas, NumPy, and a system with at least 2 GB of RAM running Windows, Ubuntu, or Mac OS X
Duration
24 hours (2 days instructor-led + 8 hours online learning)
- Course Name: Certificate Associate in Data Science – Programming Basics
- Location: Singapore
- Duration: 2 days classroom + 8 hours online
- Exam Time: 60 minutes
- Course Price: Call for price
- Minimum requirements: Foundational Certificate in Programming
Course contents
# | Topic | Method of Delivery |
---|---|---|
Day 1 | | |
1 | Chapter 1 – Python Programming Language Syntax: 1.1 Comments; 1.2 End-of-Line; 1.3 Semicolon; 1.4 Indentation; 1.5 White-space Within Lines; 1.6 Parentheses | Instructor Led |
2 | Chapter 2 – Operators: 2.1 Arithmetic Operations; 2.2 Bitwise Operations; 2.3 Assignment Operations; 2.4 Comparison Operations; 2.5 Boolean Operations; 2.6 Identity and Membership Operators | Instructor Led |
 | Case Study | Hands-on session |
Day 2 | | |
3 | Chapter 3 – Data Types: 3.1 Lists; 3.2 Tuples; 3.3 Dictionaries; 3.4 Sets; 3.5 More Specialized Data Structures | |
4 | Chapter 4 – Control Flow: 4.1 Runtime Errors; 4.2 Catching Exceptions: try and except; 4.3 Raising Exceptions: raise; 4.4 Diving Deeper into Exceptions; 4.5 try…except…else…finally | Instructor Led |
5 | Chapter 5 – Iterators: 5.1 List Comprehensions; 5.2 Generators; 5.3 Modules and Packages | Instructor Led |
6 | Chapter 6 – String Manipulation and Regular Expressions: 6.1 Simple String Manipulation in Python; 6.2 Format Strings; 6.3 Flexible Pattern Matching with Regular Expressions | Instructor Led |
 | Case Study | Hands-on session |
 | Assignment | Online, self-paced |
Certification
- Certificate Title: Certificate Associate in Data Science – Programming Basics
- Certificate Awarding Body: ITPACS
About ITPACS
Information Technology Professional Accreditations and Certifications Society (ITPACS) is a non-profit organization focused on improving technology skills for the future. ITPACS offers associate-level, professional-level, and leader certifications across six domains: data science, web development, mobile development, cyber security, IoT, and blockchain. Applicants have to go through an exam eligibility process demonstrating their experience.
Eligibility
The Associate certification is intended for individuals with less than one year of working experience in the field. It is ideal for newcomers starting out in the profession or those seeking to enter it. Applicants are required to have completed the application process prior to taking the exam.
Exam
- Exam Format: Closed-book, proctored
- Questions: 30 multiple-choice questions and coding exercises
- Passing Score: 65%
- Exam Duration: 60 minutes
- Exam needs to be taken within 12 months from the exam voucher issue date
Data Science
Data science is not a single science so much as a collection of scientific disciplines integrated for the purpose of analyzing data. Alongside various statistical and mathematical techniques, these disciplines include:
- Computer science
- Data engineering
- Visualization
- Domain-specific knowledge and approaches
With the advent of cheaper storage technology, more and more data has been collected and stored, permitting processing and analysis that were previously infeasible. This analysis created a need for techniques to make sense of the data. Such large datasets, when analyzed to identify trends and patterns, have come to be known as big data.
Analyzing big data is not simple, and it has given rise to a specialization among developers, who became known as data scientists. Drawing upon a myriad of technologies and expertise, they are able to analyze data to solve problems that previously were either not envisioned or were too difficult to solve.
The various data science techniques that we will illustrate have been used to solve a variety of problems. Many applications of these techniques are motivated by economic gain, but they have also been used to address pressing social and environmental problems. Problem domains where these techniques have been used include finance, optimizing business processes, understanding customer needs, performing DNA analysis, foiling terrorist plots, and finding relationships between transactions to detect fraud, among many other data-intensive problems.
Data mining is a popular application area for data science. In this activity, large quantities of data are processed and analyzed to glean information about the dataset, to provide meaningful insights, and to develop meaningful conclusions and predictions. It has been used to analyze customer behavior, to detect relationships between what may appear to be unrelated events, and to make predictions about future behavior.
Machine learning is an important aspect of data science. This technique allows the computer to solve various problems without being explicitly programmed for each one. It has been used in self-driving cars, speech recognition, and web search. In data mining, the data is extracted and processed. With machine learning, computers use the data to take some sort of action.
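The idea of solving a problem from data rather than from hand-written rules can be sketched in a few lines of plain Python. The example below fits a straight line to a handful of made-up (hours studied, exam score) pairs using ordinary least squares and then predicts a score for an unseen input. The data and the `fit_line` helper are hypothetical, and real projects would normally rely on a library such as Scikit-Learn.

```python
# Hypothetical training examples: (hours studied, exam score).
examples = [(1, 52), (2, 54), (3, 56), (4, 58)]


def fit_line(points):
    """Ordinary least squares fit for y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in points)
    variance = sum((x - mean_x) ** 2 for x, _ in points)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


# The relationship is learned from the examples, not written by hand.
slope, intercept = fit_line(examples)

# Use the learned relationship on an input that was not in the data.
predicted = slope * 5 + intercept
print(slope, intercept, predicted)
```

Nothing in the code states that each extra hour adds two points; that relationship is recovered from the examples, which is the sense in which the program is not explicitly programmed for the task.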