Trending Technology Machine Learning, Artificial Intelligent, Block Chain, IoT, DevOps, Data Science

Naive Bayes in Machine Learning


The dataset covers patients who had undergone surgery for breast cancer.
Features of the dataset:
  • Age - Age of the patient at the time of the operation.
  • Year - Year of the operation (year - 1900).
  • Nodes - Number of positive axillary nodes detected.
  • Class (Survived):
    1 - the patient survived 5 years or longer
    2 - the patient died within 5 years

Given a patient's details, we need to predict whether the patient survived.

    Import required libraries

    
    # For mathematical calculation
    import numpy as np
    
    # For handling datasets
    import pandas as pd
    
    # For plotting graphs
    from matplotlib import pyplot as plt
    
    # Import the Gaussian Naive Bayes classifier from scikit-learn
    from sklearn.naive_bayes import GaussianNB
    

    Import dataset

    
    # Import the csv file
    df = pd.read_csv('data.csv')
    
    print(df.head())
    '''
    Output:
       Age  Year  Nodes  Survived
    0   30    64      1         1
    1   30    62      3         1
    2   30    65      0         1
    3   31    59      2         1
    4   31    65      4         1
    '''
    

    Plot the classes against features.

    
    # Plot each feature against the class to see whether
    # any feature visibly separates the two classes
    plt.xlabel('Feature')
    plt.ylabel('Survived') 
    
    X = df.loc[:,'Age']
    Y = df.loc[:,'Survived']
    plt.scatter(X, Y,color='blue',label='Age')
    
    X = df.loc[:,'Year']
    Y = df.loc[:,'Survived']
    plt.scatter(X, Y,color='green',label='Year')
    
    X = df.loc[:,'Nodes']
    Y = df.loc[:,'Survived']
    plt.scatter(X, Y,color='red',label='Nodes')
    
    plt.legend(loc=4, prop={'size': 7})
    plt.show()
    

    Prepare data for training

    
    # Prepare the training set: features are the columns
    # Age through Nodes, the target is Survived
    X = df.loc[:,'Age':'Nodes']
    Y = df.loc[:,'Survived']
    

    Train the model

    
    clf = GaussianNB()
    
    # Train the model
    clf.fit(X,Y)
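
To make the training step less of a black box, here is a minimal standard-library sketch of the idea behind Gaussian Naive Bayes: for each class, estimate a prior plus a mean and variance per feature, then pick the class with the highest log posterior, treating the features as independent. It is an illustration only, not scikit-learn's implementation. The function names, the constant `eps` variance floor, and the toy rows are all assumptions made up for this example (scikit-learn scales its smoothing term differently, and the rows are not taken from data.csv).

```python
import math
from collections import defaultdict


def fit_gaussian_nb(rows, labels, eps=1e-9):
    """Estimate a prior, per-feature means and variances for each class."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)

    model = {}
    total = len(rows)
    for label, members in grouped.items():
        n = len(members)
        columns = list(zip(*members))
        means = [sum(col) / n for col in columns]
        # Population variance per feature; eps (an assumed constant)
        # keeps the variance strictly positive
        variances = [
            sum((v - m) ** 2 for v in col) / n + eps
            for col, m in zip(columns, means)
        ]
        model[label] = (n / total, means, variances)
    return model


def log_posterior(model, row):
    """Unnormalised log posterior of every class for one row."""
    scores = {}
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for x, m, var in zip(row, means, variances):
            # Log of the Gaussian density for one feature
            score += -0.5 * math.log(2 * math.pi * var) - (x - m) ** 2 / (2 * var)
        scores[label] = score
    return scores


def predict(model, row):
    """Return the class with the highest log posterior."""
    scores = log_posterior(model, row)
    return max(scores, key=scores.get)


# Made-up toy rows in the [Age, Year, Nodes] layout (NOT from data.csv)
toy_X = [[30, 64, 1], [35, 62, 0], [40, 65, 2],
         [50, 60, 15], [55, 61, 20], [60, 59, 25]]
toy_Y = [1, 1, 1, 2, 2, 2]

model = fit_gaussian_nb(toy_X, toy_Y)
print(predict(model, [33, 63, 1]))    # 1
print(predict(model, [58, 60, 22]))   # 2
```

Working in log space avoids multiplying many small probabilities together, which is why the sketch sums log densities instead of taking a product.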
    

    Test the model

    
    # Test the model (returns the predicted class for each row).
    # Each row is [Age, Year, Nodes].
    prediction = clf.predict([[12,70,12],
                              [13,20,13]])
    
    print(prediction)
    '''
    Output:
    [1 2]
    '''
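
Two hand-picked inputs show that the classifier runs, but they do not say how often it is right. One common way to get a rough estimate is to hold back part of the data and score predictions on it. The sketch below assumes scikit-learn's `train_test_split` and `accuracy_score`, and it uses synthetic rows with a made-up labelling rule as a stand-in for data.csv, so the printed accuracy says nothing about the real dataset. With the real data, `X` and `Y` from the post would take the place of `X_demo` and `Y_demo`.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Synthetic stand-in for data.csv (hypothetical values, same column layout)
rng = np.random.default_rng(0)
n = 200
X_demo = np.column_stack([
    rng.integers(30, 80, n),   # Age
    rng.integers(58, 70, n),   # Year
    rng.integers(0, 30, n),    # Nodes
])
# Hypothetical labelling rule, only so the demo has two classes
Y_demo = np.where(X_demo[:, 2] > 10, 2, 1)

# Hold back 25% of the rows; stratify keeps the class ratio similar
X_train, X_test, Y_train, Y_test = train_test_split(
    X_demo, Y_demo, test_size=0.25, random_state=0, stratify=Y_demo)

# Fit on the training rows only, then score on the unseen rows
clf_demo = GaussianNB().fit(X_train, Y_train)
accuracy = accuracy_score(Y_test, clf_demo.predict(X_test))
print(accuracy)
```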
