ARTIFICIAL INTELLIGENCE AND PYTHON DAY 1 STANLEY LIANG, LASSONDE - - PowerPoint PPT Presentation



SLIDE 1

ARTIFICIAL INTELLIGENCE AND PYTHON

DAY 1 STANLEY LIANG, LASSONDE SCHOOL OF ENGINEERING, YORK UNIVERSITY

SLIDE 2

WHAT IS PYTHON

  • An interpreted, high-level programming language for general-purpose programming.

  • Python features a dynamic type system and automatic memory management.
  • Python supports multiple programming paradigms, including object-oriented, imperative, functional and procedural.

  • Python also has a large and comprehensive standard library and multiple packages for different purposes.

  • In this course, we will use the Anaconda distribution with various Python tools to implement AI and machine learning tasks.

SLIDE 3

INSTALL SOFTWARE

  • Download and install Python 3.6 from https://www.python.org/downloads/
  • Visit https://anaconda.org/anaconda/python
  • For PC with Windows 10: open a command line and type: conda install -c anaconda python
  • For Mac, download from: https://www.anaconda.com/download/#macos
  • Open Anaconda Navigator and launch Jupyter
  • Jupyter is an interactive, notebook-based development environment for Python
  • Other choices: PyCharm, Visual Studio, Spyder, etc.

SLIDE 4

BASIC TYPE AND ASSIGNMENT

  • String - unlike C, Python has no char
  • Number - unlike C, variables are not declared as int or float / double; Python infers the numeric type from the value
  • Boolean - True / False; capitalize the first letter
  • Multiple Assignment
  • The null value – None, not null
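
The points above can be sketched in a few lines. This is a minimal illustration only; all variable names are made up for the example.

```python
# Strings: a single character is just a string of length 1 (there is no char type).
letter = "a"
greeting = "hello"

# Numbers: no type declaration; Python picks the numeric type from the value.
count = 3
ratio = 3 / 2          # "/" is true division and gives a float

# Booleans start with a capital letter.
is_ready = True

# Multiple assignment binds several names in one statement.
x, y, z = 1, 2.5, "three"

# None is Python's null value; test for it with "is".
result = None

print(type(letter).__name__, type(count).__name__, type(ratio).__name__)
print(result is None)
```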
SLIDE 5

FLOW CONTROL

  • Be careful of the indentation
  • Branching: if-elif-else (Python has no "then" keyword)
  • Iteration: for loop and while loop; there is no native do-while loop
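
A short sketch of the three constructs, plus one common way to imitate a do-while loop. The function name `classify` and the sample values are hypothetical.

```python
def classify(n):
    """Branching: the indented lines under each condition form its block."""
    if n < 0:
        return "negative"
    elif n == 0:
        return "zero"
    else:
        return "positive"


# for loop: iterate over the items of a sequence.
labels = []
for value in [-2, 0, 7]:
    labels.append(classify(value))

# while loop: repeat as long as the condition holds.
total = 0
i = 1
while i <= 4:
    total += i
    i += 1

# No do-while: "while True" with a break at the end runs the body at least once.
attempts = 0
while True:
    attempts += 1
    if attempts >= 3:
        break

print(labels, total, attempts)
```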
SLIDE 6

DATA STRUCTURE

  • Tuple - read-only collections of items
  • List - uses the square bracket notation and can be indexed using array notation
  • Dictionary - a mapping of names to values, i.e. key-value pairs. Note the use of the curly braces.
  • Summary
  • Tuple uses ( ), List uses [ ], Dictionary uses { } with ‘ ’ around string keys
  • To subset, always use [ ]
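
The three structures side by side, as a minimal sketch with made-up sample data:

```python
point = (3, 4)                     # tuple: read-only
scores = [90, 75, 60]              # list: can be changed in place
ages = {"ann": 31, "bob": 27}      # dictionary: key-value pairs

# Subsetting always uses square brackets.
first_coord = point[0]
ann_age = ages["ann"]

# Lists and dictionaries can be modified after creation.
scores[1] = 80
scores.append(55)
ages["cat"] = 45
last_two = scores[-2:]             # slice: the last two items

# Assigning to a tuple element raises TypeError.
try:
    point[0] = 10
    tuple_is_read_only = False
except TypeError:
    tuple_is_read_only = True

print(first_coord, ann_age, scores, last_two, tuple_is_read_only)
```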
SLIDE 7

FUNCTION IN PYTHON

  • A function in Python is defined with the keyword “def”, i.e. define
  • Do not use “func” or “function” as the keyword, but remember to use the parentheses “( )” as the sign of a “function call”
  • The trickiest thing about Python is the whitespace.
  • Ensure that you have an empty new line after indented code.
  • A function can take one or more arguments, or none, and it does not need to declare a return type
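
A minimal sketch of these rules; the function names `greet` and `area` are hypothetical.

```python
def greet():
    """No arguments and no return statement: the call evaluates to None."""
    print("hello")


def area(width, height=1.0):
    """Two arguments; the second has a default value."""
    return width * height


nothing = greet()        # the parentheses make this a function call
square = area(3, 3)
strip = area(5)          # height falls back to its default of 1.0

print(nothing, square, strip)
```

Note that the indented body belongs to the function, and the blank line after it marks where the definition ends when typing at the interactive prompt.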

SLIDE 8

THE NUMPY

  • NumPy provides the foundation data structures and operations for SciPy
  • These are arrays (ndarrays) that are efficient to define and manipulate
  • Before use, you need to import the numpy package
  • If you use Anaconda, NumPy is installed by default
  • If Python cannot find it, use pip or conda to install it from the command line
  • python -m pip install --user numpy scipy matplotlib ipython jupyter pandas sympy nose
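
Once NumPy is installed, a first session might look like the sketch below. The array contents are arbitrary sample values.

```python
import numpy as np

a = np.array([1, 2, 3])                    # one-dimensional ndarray
b = np.array([[1, 2, 3], [4, 5, 6]])       # two-dimensional ndarray

print(a.shape, b.shape)                    # dimensions of each array

doubled = a * 2                            # element-wise arithmetic, no loop needed
summed = b + a                             # a is broadcast across each row of b
col_means = b.mean(axis=0)                 # mean of each column

print(doubled, summed, col_means)
```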
SLIDE 9

DATA VISUALIZATION

  • In Python, we can visualize the data by the Matplotlib package
  • Matplotlib can be used for creating plots and charts
  • The general procedure to use Matplotlib
  • import matplotlib.pyplot as plt
  • Call a plotting function such as plt.plot( ) or plt.scatter( ), etc.
  • Call the plot property configuration functions such as xlabel( ), ylabel( ), xlim( ), ylim( ), etc.
  • Call title( ), text( ), etc. to add annotations
  • Visualize the configured plot with plt.show( )
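
The procedure can be sketched as follows. The data are made up, and the "Agg" backend line is an assumption that lets the script run without a display; inside Jupyter it can be dropped.

```python
import matplotlib
matplotlib.use("Agg")   # assumption: no display available; omit inside Jupyter
import matplotlib.pyplot as plt

x = [0, 1, 2, 3, 4]
y = [v ** 2 for v in x]

plt.figure()
plt.plot(x, y, marker="o")          # plotting function
plt.xlabel("x")                     # property configuration
plt.ylabel("x squared")
plt.xlim(0, 4)
plt.title("A simple line plot")     # annotations
plt.text(2, 12, "y = x ** 2")
ax = plt.gca()                      # keep a handle on the axes for inspection
plt.show()                          # draws the window or inline figure when a display exists
```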
SLIDE 10

PANDAS AND DATAFRAME

  • Pandas provides data structures and functionality to quickly manipulate and

analyze data

  • The two important elements in Pandas
  • Series - a one-dimensional array of data where the rows are labeled by an index (for example, a time axis)
  • Subset a Series by index
  • DataFrame - a two-dimensional table where the rows and the columns can be labeled
  • Subset a DataFrame by columns
  • Subset a DataFrame by rows
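
A minimal sketch of both structures and the subsetting operations listed above. The labels and numbers are invented sample data.

```python
import pandas as pd

# Series: one-dimensional data with a labeled index.
temps = pd.Series([21.5, 23.0, 19.8], index=["mon", "tue", "wed"])
tue_temp = temps["tue"]             # subset a Series by index label

# DataFrame: labeled rows and labeled columns.
df = pd.DataFrame(
    {"height": [1.60, 1.75, 1.82], "weight": [55, 70, 88]},
    index=["ann", "bob", "cat"],
)
heights = df["height"]              # subset by column (returns a Series)
bob = df.loc["bob"]                 # subset by row label
first_two = df.iloc[0:2]            # subset by row position

print(tue_temp)
print(heights)
print(bob)
print(first_two)
```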
SLIDE 11

LOAD DATA FROM A CSV FILE

  • Before starting machine learning, you should load your data into Python
  • The most common format for machine learning data is CSV files
  • Three ways to load a CSV into Python
  • Load CSV Files with the Python Standard Library
  • Load CSV Files with NumPy
  • Load CSV Files with Pandas (recommended)
  • CSV from two sources
  • Local machine – always use ‘/’ to define the path
  • From a URL – using urllib.request.urlopen or pandas.read_csv
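
The three loading approaches can be compared on the same data. In this sketch an in-memory string (the hypothetical `CSV_TEXT`) stands in for a real file so the example is self-contained; with real data, a local path or a URL would be passed to `pandas.read_csv` instead.

```python
import csv
import io

import numpy as np
import pandas as pd

# Assumption: a tiny made-up CSV with a header row, held in memory.
CSV_TEXT = "sepal_length,sepal_width\n5.1,3.5\n4.9,3.0\n6.2,2.9\n"

# 1. Python standard library: every field comes back as a string.
rows = list(csv.reader(io.StringIO(CSV_TEXT)))
header, records = rows[0], rows[1:]

# 2. NumPy: numeric data only, so the header row is skipped.
array = np.loadtxt(io.StringIO(CSV_TEXT), delimiter=",", skiprows=1)

# 3. pandas: the header becomes column names and types are inferred.
frame = pd.read_csv(io.StringIO(CSV_TEXT))

print(header, records[0])
print(array.shape)
print(frame.dtypes)
```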
SLIDE 12

UNDERSTAND YOUR DATA

  • You must understand your data in order to get the best results
  • Take a peek for a first impression
  • Review the dimension of the dataset
  • Review the data type of the attributes (columns)
  • Summarize the distribution of instances across classes in your dataset
  • Summarize your data using descriptive statistics
  • Understand the relationships in your data using correlations
  • Review the skew of the distributions of each attribute
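
With pandas, each review step above maps to roughly one call. The sketch below uses a small invented DataFrame; the column names are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    "age": [22, 35, 47, 51, 29, 63],
    "income": [28, 52, 61, 75, 40, 58],
    "label": ["no", "yes", "yes", "yes", "no", "no"],
})

print(df.head(3))                       # peek at the first rows
print(df.shape)                         # dimensions: (rows, columns)
print(df.dtypes)                        # data type of each attribute

class_counts = df.groupby("label").size()
print(class_counts)                     # instances per class

print(df.describe())                    # descriptive statistics

numeric = df[["age", "income"]]
correlations = numeric.corr(method="pearson")
print(correlations)                     # pairwise correlations
print(numeric.skew())                   # skew of each numeric attribute
```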
SLIDE 13

VISUALIZE YOUR DATA

  • You must understand your data in order to get the best results from machine learning algorithms.
  • The intuitive way to learn more about your data is to visualize it.
  • Plots for univariate data (one variable)
  • Histogram
  • Density Plot
  • Box & Whisker Plot
  • Plots for multivariate data (more than one variable)
  • Correlation Matrix Plot
  • Scatter Plot Matrix
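
Most of these plots are available through pandas and Matplotlib. The sketch below uses invented data and assumes the "Agg" backend so it runs without a display. The density plot is left out because `df.plot(kind="density")` additionally requires SciPy.

```python
import matplotlib
matplotlib.use("Agg")   # assumption: no display available; omit inside Jupyter
import matplotlib.pyplot as plt
import pandas as pd
from pandas.plotting import scatter_matrix

df = pd.DataFrame({
    "age": [22, 35, 47, 51, 29, 63, 38, 44],
    "income": [28, 52, 61, 75, 40, 58, 49, 66],
    "score": [3.1, 4.0, 4.4, 4.9, 3.6, 4.2, 3.9, 4.5],
})

# Univariate plots: one histogram per column, then box and whisker plots.
hist_axes = df.hist()
box_ax = df.plot(kind="box")

# Multivariate plots: correlation matrix drawn as a colored grid.
fig, corr_ax = plt.subplots()
image = corr_ax.matshow(df.corr().to_numpy(), vmin=-1, vmax=1)
fig.colorbar(image)
corr_ax.set_xticks(range(len(df.columns)))
corr_ax.set_yticks(range(len(df.columns)))
corr_ax.set_xticklabels(df.columns)
corr_ax.set_yticklabels(df.columns)

# Scatter plot matrix: every pair of columns plotted against each other.
scatter_axes = scatter_matrix(df)

plt.show()
```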
SLIDE 14

PREPARE YOUR DATA FOR MACHINE LEARNING

  • Many machine learning algorithms make assumptions about your data
  • Different algorithms require different data transforms – data preprocessing
  • Prepare the data to best expose the structure of the problem
  • Rescale data
  • Standardize data
  • Normalize data
  • Binarize data
  • The scikit-learn library of Python provides two standard methods for transforming data
  • Fit and Multiple Transform
  • Combined Fit-And-Transform
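
Both calling patterns, together with the four transforms listed above, can be sketched with scikit-learn's preprocessing classes. The arrays are invented sample data, and the threshold passed to `Binarizer` is an arbitrary choice for illustration.

```python
import numpy as np
from sklearn.preprocessing import Binarizer, MinMaxScaler, Normalizer, StandardScaler

X_train = np.array([[1.0, 200.0], [2.0, 300.0], [3.0, 400.0]])
X_new = np.array([[2.0, 250.0]])

# Fit and multiple transform: learn the parameters once, then reuse them.
scaler = StandardScaler().fit(X_train)
train_std = scaler.transform(X_train)      # standardize: mean 0, std 1 per column
new_std = scaler.transform(X_new)          # same parameters applied to new data

# Combined fit-and-transform: one call when only one dataset is involved.
rescaled = MinMaxScaler(feature_range=(0, 1)).fit_transform(X_train)   # rescale to [0, 1]
normalized = Normalizer().fit_transform(X_train)                        # each row to unit length
binary = Binarizer(threshold=2.0).fit_transform(X_train)                # 1 if value > threshold

print(train_std)
print(new_std)
print(rescaled)
print(normalized)
print(binary)
```

The first pattern suits the usual case of fitting on training data and then applying the same transform to test or new data.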