Quantitative Data Management, Analysis, & Visualization With Python Course




22nd to 26th Jan 2024


19th to 23rd Feb 2024


25th to 29th March 2024


22nd to 26th April 2024


20th to 24th May 2024


24th to 28th June 2024


22nd to 26th July 2024


26th to 30th Aug 2024


23rd to 27th Sept 2024


21st to 25th Oct 2024


25th to 29th Nov 2024


16th to 20th Dec 2024


This training on data analysis using Python helps participants harness Python to analyze big data, create effective visualizations, and apply efficient machine learning algorithms. The course is designed both for novices with basic programming knowledge and for developers who want to learn more about data science and the analysis of large volumes of data.

Objectives of Training on Data Analysis Using Python

• Interactive dynamic visualizations

• K Means Clustering, Linear Regression, and Logistic Regression

• Decision Trees and Random Forests

• scikit-learn for Machine Learning Tasks

• Neural Networks

• Support Vector Machines

• Report writing on the research

• Research Design

• Python for Data Science and Machine Learning

• Implement Machine Learning Algorithms

• NumPy for Numerical Data

• Pandas for Data Analysis

• Spark for Big Data Analysis

• Matplotlib for Python Plotting


This is a basic program for participants with an elementary understanding of statistics. It is aimed at experts in agriculture, economics, livelihoods, food security, nutrition, education, and public or medical health, among others, who already have a little statistical knowledge but want to become conversant with the principles and uses of statistical modeling in Python.

Course Topics

Module 1: Basic statistical terms and concepts

o   Introduction to statistical concepts

o   Descriptive Statistics

o   Inferential statistics
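
As a brief illustration of the descriptive statistics introduced in this module, the sketch below summarizes a small sample using only Python's standard library. The sample values and variable names are hypothetical, made up for illustration; they are not course data.

```python
import statistics

# Hypothetical sample: household sizes from an imaginary survey
household_sizes = [3, 5, 4, 6, 4, 2, 4, 7]

mean_size = statistics.mean(household_sizes)      # arithmetic mean
median_size = statistics.median(household_sizes)  # middle value of the sorted data
mode_size = statistics.mode(household_sizes)      # most frequent value
stdev_size = statistics.stdev(household_sizes)    # sample standard deviation

print(f"mean={mean_size}, median={median_size}, mode={mode_size}, stdev={stdev_size:.2f}")
```

Inferential statistics then uses summaries like these to draw conclusions about the wider population the sample came from.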

Module 2: Research Design

o   The role and purpose of research design

o   Types of research designs

o   The research process

o   Which method to choose?

o   Exercise: Identify a project of choice and develop a research design

Module 3: Survey Planning, Implementation and Completion

o   Types of surveys

o   The survey process

o   Survey design

o   Methods of survey sampling

o   Determining the Sample size

o   Planning a survey

o   Conducting the survey

o   After the survey

o   Exercise: Plan a survey based on the selected research design
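
The topic "Determining the Sample size" can be sketched in a few lines of Python. The example below assumes Cochran's formula for estimating a proportion, with an optional finite population correction; the outline does not say which formula the course teaches, so treat this as one common approach. The function name `sample_size` and its parameters are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size(margin_of_error, confidence=0.95, proportion=0.5, population=None):
    """Sample size for estimating a proportion (Cochran's formula).

    This is an illustrative assumption, not necessarily the course's method.
    proportion=0.5 is the most conservative choice (largest sample).
    """
    # z-score for a two-sided interval at the requested confidence level
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n0 = (z ** 2) * proportion * (1 - proportion) / margin_of_error ** 2
    if population is not None:
        # Finite population correction for small populations
        n0 = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n0)


large_population_n = sample_size(0.05)                # 5% margin, 95% confidence
small_population_n = sample_size(0.05, population=2000)
print(large_population_n, small_population_n)
```

With a 5% margin of error at 95% confidence this gives 385 respondents for a very large population, and 323 when the population is only 2,000.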

Module 4: Introduction to Python

o   Course Intro

o   Setup

o   Installation Setup and Overview

o   IDEs and Course Resources

o   IPython/Jupyter Notebook Overview

Module 5: Learning NumPy

o   Intro to NumPy

o   Creating arrays

o   Using arrays and scalars

o   Indexing Arrays

o   Array Transposition

o   Universal Array Functions

o   Array Processing

o   Array Input and Output
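
To give a flavor of the NumPy topics listed above, here is a minimal sketch covering array creation, arithmetic with a scalar, indexing, transposition, and a universal function. The array contents and variable names are illustrative assumptions; array input and output is left out to keep the example short.

```python
import numpy as np

# Creating arrays: six integers arranged as 2 rows x 3 columns
grid = np.arange(6).reshape(2, 3)   # [[0 1 2], [3 4 5]]

# Arrays and scalars: the operation applies to every element
doubled = grid * 2

# Indexing: a whole row, and the last column of every row
second_row = grid[1]
last_column = grid[:, -1]

# Transposition: rows become columns
transposed = grid.T

# Universal function: element-wise square root
roots = np.sqrt(np.array([4.0, 9.0, 16.0]))

print(doubled, second_row, last_column, transposed.shape, roots, sep="\n")
```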

Module 6: Intro to Pandas

o   DataFrames

o   Index objects

o   Reindex

o   Drop Entry

o   Selecting Entries

o   Data Alignment

o   Rank and Sort

o   Summary Statistics

o   Missing Data

o   Index Hierarchy
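
Several of the pandas topics above fit into one short sketch: building a DataFrame, reindexing, dropping an entry, sorting, summary statistics, and handling missing data. The names and scores are made-up illustration data, not course material.

```python
import numpy as np
import pandas as pd

# Hypothetical exam scores; one maths score is missing
scores = pd.DataFrame(
    {"maths": [70.0, 85.0, np.nan], "science": [65.0, 90.0, 75.0]},
    index=["amina", "brian", "chen"],
)

# Reindex: a label that was not in the original index gets a row of NaN
reindexed = scores.reindex(["amina", "brian", "chen", "dede"])

# Drop entry: returns a new DataFrame without the given row
without_brian = scores.drop("brian")

# Sort: highest science score first
ranked = scores.sort_values("science", ascending=False)

# Summary statistics: count, mean, std, min, quartiles, max per column
summary = scores.describe()

# Missing data: replace NaN with the mean of its column
filled = scores.fillna(scores.mean())

print(summary)
print(filled)
```

Filling with the column mean is only one option; dropping incomplete rows with `dropna()` is another, and the right choice depends on the analysis.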

Module 7: Working with Data

o   Reading and Writing Text Files

o   JSON with Python

o   HTML with Python

o   Microsoft Excel files with Python

o   Merge and Merge on Index

o   Concatenating and Combining DataFrames

o   Reshaping, Pivoting and Duplicates in DataFrames

o   Mapping, Replace, Rename Index, Binning, Outliers and Permutation

o   GroupBy on DataFrames

o   GroupBy on Dict and Series

o   Splitting, Applying and Combining

o   Cross Tabulation
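
The merge, GroupBy, and cross tabulation topics in this module can be sketched together on a tiny survey-style dataset. The tables, column names, and values below are hypothetical and exist only to show the shape of each operation.

```python
import pandas as pd

# Two hypothetical tables sharing a household_id key
households = pd.DataFrame({
    "household_id": [1, 2, 3, 4],
    "county": ["Nairobi", "Kisumu", "Nairobi", "Kisumu"],
})
responses = pd.DataFrame({
    "household_id": [1, 2, 3, 4],
    "food_secure": ["yes", "no", "yes", "yes"],
    "income": [300, 150, 280, 200],
})

# Merge: join the two tables on their common key
merged = pd.merge(households, responses, on="household_id")

# GroupBy (split, apply, combine): mean income per county
mean_income = merged.groupby("county")["income"].mean()

# Cross tabulation: counts of food security status by county
table = pd.crosstab(merged["county"], merged["food_secure"])

print(mean_income)
print(table)
```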

Module 8: Big Data and Spark with Python

o   Welcome to the Big Data Section!

o   Big Data Overview

o   Spark Overview

o   Local Spark Set-Up

o   AWS Account Set-Up

o   Quick Note on AWS Security

o   EC2 Instance Set-Up

o   SSH with Mac or Linux

o   PySpark Setup

o   Lambda Expressions Review

o   Introduction to Spark and Python

o   RDD Transformations and Actions

Module 9: Data Visualization

o   Installing Seaborn

o   Histograms

o   Kernel Density Estimate Plots

o   Combining Plot Styles

o   Box and Violin Plots

o   Regression Plots

o   Heatmaps and Clustered Matrices

Module 10: Data Analysis

o   Linear Regression

o   Support Vector Machines

o   Decision Trees and Random Forests

o   Natural Language Processing

o   Discrete Uniform Distribution

o   Continuous Uniform Distribution

o   Binomial Distribution

o   Poisson Distribution

o   Normal Distribution

o   Sampling Techniques

o   T-Distribution

o   Hypothesis Testing and Confidence Intervals

o   Chi-Square Test and Distribution
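
As a small example of the confidence interval topic above, the sketch below computes an interval for a sample mean using only the standard library. It assumes a normal approximation for simplicity; for small samples the t-distribution listed in this module gives a somewhat wider interval. The function name `mean_confidence_interval` and the yield figures are hypothetical.

```python
import math
import statistics
from statistics import NormalDist


def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for the mean of `sample`.

    Simplifying assumption: uses a z-score rather than the t-distribution.
    """
    n = len(sample)
    mean = statistics.mean(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return mean - z * standard_error, mean + z * standard_error


# Hypothetical crop yields in tonnes per hectare
yields = [2.1, 2.5, 1.9, 2.4, 2.2, 2.6, 2.0, 2.3]

low, high = mean_confidence_interval(yields)
low99, high99 = mean_confidence_interval(yields, confidence=0.99)
print(f"95% CI: ({low:.2f}, {high:.2f})")
print(f"99% CI: ({low99:.2f}, {high99:.2f})")
```

Asking for more confidence widens the interval, which is why the 99% interval is wider than the 95% one.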

Module 11: Report writing for surveys, data dissemination, demand and use

o   Writing a report from survey data

o   Communication and dissemination strategy

o   Context of Decision Making

o   Improving data use in decision making

o   Culture Change and Change Management

o   Preparing a survey report, a communication and dissemination plan, and a demand and use strategy

o   Presentations and joint action planning


  • All the participants should be conversant with the English language.
  • All our courses involve a mix of one-on-one presentations, web-based tutorials, group discussions, and practical exercises.
  • All courses can be tailor-made and adjusted to meet the client’s needs.
  • We have a team of experts who work as practitioners and trainers in their respective fields.
  • Each participant will receive an Uphilos Consultancy certificate upon completion of each course.
  • All our training sessions are held at Uphilos Center. For groups of more than five people we can train at any location in Kenya, for more than ten people at any location within East Africa, and for more than twenty people at any location as per the client’s specifications.
  • Participants will be responsible for their travel, dinner, airport transfers, and other personal expenses.
  • Accommodation will be arranged upon request at a discounted price.
  • All payments should be made two weeks prior to the training for better logistics. Proof of payment should be sent to [email protected]