Wednesday, 13 April 2016

Internet of Things (IoT) is catching up!!

Gartner, Inc. has estimated that nearly 26 billion devices will be on the Internet by 2020. In its November 2015 forecast, Gartner projects that 6.4 billion connected things will be in use worldwide in 2016, rising to 20.8 billion by 2020[1]. It also estimates that the Internet of Things (IoT) will support total services spending of $235 billion in 2016. According to ABI Research, revenues from integrating, storing, analyzing, and presenting IoT data will reach USD 5.7 billion in 2015[2]. From Fitbits to Apple Watches, wearable tech and IoT devices are expected to reach the marketplace in large numbers soon.

Another estimate, published by Business Insider in 2016, puts the number of devices connected to the Internet at 34 billion by 2020. IoT devices will account for 24 billion of these, while traditional computing devices (e.g. smartphones, tablets, smartwatches) will make up the remaining 10 billion. Nearly $6 trillion will be spent on IoT solutions over the next five years[3].

The Internet is a network of networks, and a network consists of connected devices; the Internet of Things extends this to everyday objects through tiny embedded sensors and computing power. Internet of Things, Internet of Everything (Cisco), and Smart Things (IBM) are terms coined by different companies for the same idea: many connected devices. Thanks to advances in semiconductor, web, wireless, mobile, and security technologies, anyTHING and everyTHING can be connected to the Internet. The THING on the INTERNET can be your refrigerator, your home security system, a surveillance camera at your child’s play home or your remote office, an RFID system, or any other everyday THING. Hence the name: the Internet of Things (IoT) is the phenomenon of every conceivable device getting connected to the Internet.

The idea of IoT is that not only can your computer and your smartphone talk to each other over the Internet, but so can all the things around you: connected homes, refrigerators, cars, trains, and roads, as well as devices that track an individual’s movement, pulse rate, heartbeat, calories burned, location, and so on. This data can be “pushed” to numerous BIG DATA applications to solve everyday problems, potentially leading to a new and improved customer experience.

If programmed properly, the THINGS on the Internet can be controlled, observed, or analysed by other smart devices on the Internet. An IoT environment comprises smart devices that are Always connected to each other, Anywhere and at Anytime (the 3 As). These devices can be configured to send their data continuously to a cloud server for further analysis that supports decisions and business actions. The real value lies in analysing the data, and particularly in how that analysis can help predict the future.

What does a THING need in order to be on the Internet?

1. It should be an IP device (IPv6 or IPv4).

2. It should have a unique identification with which to start communicating.

3. It should be able to send configuration, events, and sensory data over the Internet.

4. It should be able to present its identity to front-end mobile or web applications so that those applications can extract data from it. Analysis of that data then helps in making automatic decisions about controlling and configuring the device.
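The requirements above can be made concrete with a small sketch of the message a THING might send. This is only an illustration in base R: the function names `make_reading` and `to_json`, the field names, the device ID, and the IP address are all hypothetical and do not come from any IoT standard or library.

```r
# Build one sensor reading. Every field name here is an assumption
# chosen for illustration, not part of any IoT protocol.
make_reading <- function(device_id, ip, sensor, value, unit) {
  stopifnot(nzchar(device_id), is.numeric(value))
  list(
    device_id = device_id,   # requirement 2: unique identification
    ip        = ip,          # requirement 1: the device's IP address
    sensor    = sensor,      # requirement 3: sensory data
    value     = value,
    unit      = unit,
    timestamp = format(Sys.time(), "%Y-%m-%dT%H:%M:%SZ", tz = "UTC")
  )
}

# Serialise a flat list to a JSON string using base R only.
# Strings are not escaped, so this is suitable for simple values only.
to_json <- function(x) {
  fields <- vapply(names(x), function(n) {
    v <- x[[n]]
    val <- if (is.numeric(v)) format(v) else paste0('"', v, '"')
    paste0('"', n, '": ', val)
  }, character(1))
  paste0("{", paste(fields, collapse = ", "), "}")
}

reading <- make_reading("thermostat-001", "192.0.2.10", "temperature", 21.5, "C")
payload <- to_json(reading)
cat(payload, "\n")
```

A front-end application (requirement 4) would receive such payloads and use the device ID to decide which THING to configure or control.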

IoT can play a major role in utilities, oil & gas, manufacturing, transportation, retail, and other business sectors, because the THINGS these businesses depend on can send enough data to a remote server to support informed decisions. The data sent by IoT devices can be used to improve asset utilization, track devices more accurately, and provide real-time insights.

Imagine you have a THING inside your car that senses the temperature, knows your schedule, and starts the engine automatically 10 minutes before you leave the office during a Chicago winter. How about controlling your home thermostat and security system while you are away from home? Pretty soon, all home appliances will be Wi-Fi and IoT enabled, letting you make informed decisions about when to adjust your thermostat, when the food in the fridge has to be thrown out, when you need to pick up milk, or when your FedEx package will reach your home so that you can sign for and collect the parcel.
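The car scenario boils down to a simple rule combining a sensor reading with a schedule. A minimal sketch in R follows; the function name `should_start_engine` and the 0 °C threshold are assumptions made for illustration, and only the 10-minute lead time comes from the example above.

```r
# Decide whether the car should pre-start.
# cold_threshold_c is a hypothetical cut-off; lead_minutes mirrors the
# "10 minutes before you leave" in the scenario.
should_start_engine <- function(outside_temp_c, minutes_until_departure,
                                cold_threshold_c = 0, lead_minutes = 10) {
  is_cold <- outside_temp_c <= cold_threshold_c
  is_time <- minutes_until_departure >= 0 &&
    minutes_until_departure <= lead_minutes
  is_cold && is_time
}

should_start_engine(-12, 10)   # cold, and departure is 10 minutes away
should_start_engine(-12, 45)   # cold, but too early
should_start_engine(18, 10)    # mild weather, no need to pre-start
```

In a real deployment the temperature would come from the THING's sensor and the departure time from a calendar service; both are simply passed in as numbers here.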

Though IoT has many benefits, it also has several challenges. The main ones are signaling, security, power consumption, and bandwidth.

References:
1.  “Gartner Says 6.4 Billion Connected "Things" Will Be in Use in 2016, Up 30 Percent From 2015”, Gartner, Stamford, Conn., November 10, 2015. Last visited March 15, 2016. http://www.gartner.com/newsroom/id/3165317

2.  “Market for IoT Analytics to Reach US$5.7 Billion in 2015, with Startups Driving the Innovation”, ABI Research, London, United Kingdom, January 13, 2015. Last visited March 16, 2016. https://www.abiresearch.com/press/market-for-iot-analytics-to-reach-us57-billion-in-/

3.  “Here are IoT trends that will change the way businesses, governments, and consumers interact with the world”, John Greenough and Jonathan Camhi, Business Insider, March 10, 2016. http://www.businessinsider.com/top-internet-of-things-trends-2016-1?IR=T


Wednesday, 11 March 2015

Caret R Package - classification and regression training

(http://topepo.github.io/caret/index.html)

The caret package (short for classification and regression training) is a set of functions that streamline the process of creating predictive models for complex regression and classification problems. The package contains tools for:
  • data splitting
  • pre-processing
  • feature selection
  • model tuning using resampling
  • variable importance estimation

The following steps install the caret package, which has many dependencies.

Install ‘minqa’, ‘RcppEigen’, ‘scales’, ‘lme4’, ‘ggplot2’, ‘reshape2’ and ‘BradleyTerry2’ one by one, then install caret itself.

Step 1: install.packages("minqa")
Step 2: install.packages("RcppEigen")
Step 3: install.packages("scales")
Step 4: install.packages("lme4")
Step 5: install.packages("ggplot2")
Step 6: install.packages("reshape2")
Step 7: install.packages("BradleyTerry2")
Step 8: install.packages("caret", dependencies = c("Depends", "Suggests"))
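Before running the steps, it can help to check which of the packages are already present. The sketch below uses only base R; the helper name `missing_packages` is hypothetical, and the actual install call is left commented out so the snippet has no side effects.

```r
# Packages named in the installation steps, plus caret itself.
deps <- c("minqa", "RcppEigen", "scales", "lme4",
          "ggplot2", "reshape2", "BradleyTerry2", "caret")

# Return the subset of `pkgs` that is not yet installed.
# `installed` defaults to the names reported by installed.packages().
missing_packages <- function(pkgs,
                             installed = rownames(installed.packages())) {
  setdiff(pkgs, installed)
}

to_install <- missing_packages(deps)
if (length(to_install) > 0) {
  message("Not yet installed: ", paste(to_install, collapse = ", "))
  # Uncomment to install them:
  # install.packages(to_install)
} else {
  message("All packages are already installed.")
}
```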

Example of predicting with the "glm" method:
> library(caret)
> library(kernlab)   # provides the spam data set
> data(spam)
> inTrain <- createDataPartition(y=spam$type, p=0.75, list=FALSE)   # partition: 75% training, 25% testing
> training <- spam[inTrain, ]
> testing <- spam[-inTrain, ]
> dim(training)
[1] 3451   58
> dim(testing)
[1] 1150   58
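When the outcome is a factor, the partitioning step samples within each class so that the training and testing sets keep roughly the same class mix. The base-R sketch below illustrates that idea; it is not caret's implementation. The function `stratified_split` is hypothetical, the labels are synthetic, and the class counts (2788 nonspam, 1813 spam) are assumed to mirror the spam data. The seed is set before sampling so that the split is reproducible.

```r
# Synthetic labels only; the class counts are an assumption.
labels <- factor(rep(c("nonspam", "spam"), times = c(2788, 1813)))

# Sample a fraction p of the row indices within each class.
# Rounding up with ceiling() is a choice made for this sketch.
stratified_split <- function(y, p) {
  idx_by_class <- split(seq_along(y), y)
  picked <- lapply(idx_by_class, function(idx) {
    idx[sample.int(length(idx), size = ceiling(p * length(idx)))]
  })
  sort(unlist(picked, use.names = FALSE))
}

set.seed(1234)  # seed first, then sample
in_train <- stratified_split(labels, p = 0.75)

length(in_train)                       # number of training rows
length(labels) - length(in_train)      # number of testing rows
prop.table(table(labels[in_train]))    # class mix in the training set
prop.table(table(labels[-in_train]))   # class mix in the testing set
```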

> set.seed(1234)
> fit<-train(type~., data=training, method="glm")
Loading required namespace: e1071
There were 26 warnings (use warnings() to see them)
> fit
Generalized Linear Model

3451 samples
  57 predictor
   2 classes: 'nonspam', 'spam'

No pre-processing
Resampling: Bootstrapped (25 reps)

Summary of sample sizes: 3451, 3451, 3451, 3451, 3451, 3451, ...

Resampling results

  Accuracy   Kappa      Accuracy SD  Kappa SD 
  0.9207482  0.8330317  0.008444636  0.01755059
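"Resampling: Bootstrapped (25 reps)" means that the accuracy above is an average over 25 refits, with "Accuracy SD" giving the spread across them. The sketch below shows the general bootstrap idea with base R on synthetic data: draw rows with replacement, fit a logistic regression, and score it on the rows that were not drawn. It is an illustration under those assumptions, not caret's code; `boot_accuracy` and the toy data are hypothetical.

```r
set.seed(1234)

# Toy two-class data: one predictor, class probability driven by x.
n <- 500
x <- rnorm(n)
y <- factor(ifelse(runif(n) < plogis(2 * x), "spam", "nonspam"))
toy <- data.frame(x = x, y = y)

# For each repetition: resample rows with replacement, fit on the
# resample, and measure accuracy on the rows left out of it.
boot_accuracy <- function(data, reps = 25) {
  vapply(seq_len(reps), function(i) {
    in_bag   <- sample.int(nrow(data), replace = TRUE)
    held_out <- setdiff(seq_len(nrow(data)), in_bag)
    model <- glm(y ~ x, data = data[in_bag, ], family = binomial)
    prob  <- predict(model, newdata = data[held_out, ], type = "response")
    # glm models the probability of the second factor level ("spam").
    pred  <- ifelse(prob > 0.5, levels(data$y)[2], levels(data$y)[1])
    mean(pred == data$y[held_out])
  }, numeric(1))
}

acc <- boot_accuracy(toy)
c(Accuracy = mean(acc), AccuracySD = sd(acc))
```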


>
> fit$finalModel
 
Call:  NULL
 
Coefficients:
      (Intercept)               make  
       -1.515e+00         -3.393e-01  
          address                all  
       -1.482e-01          9.183e-02  
            num3d                our  
        2.531e+00          5.661e-01  
             over             remove  
        4.999e-01          2.612e+00  
         internet              order  
        5.661e-01          8.957e-01  
             mail            receive  
        9.189e-02         -2.957e-01  
             will             people  
       -1.321e-01         -2.583e-01  
           report          addresses  
        1.068e-01          1.121e+00  
             free           business  
        9.468e-01          1.080e+00  
            email                you  
        1.910e-02          8.164e-02  
           credit               your  
        1.387e+00          2.326e-01  
             font             num000  
        3.465e-01          3.525e+00  
            money                 hp  
        1.376e+00         -1.982e+00  
              hpl             george  
       -1.369e+00         -9.258e+00  
           num650                lab  
        9.965e-01         -2.143e+00  
             labs             telnet  
       -6.141e-01         -1.234e-01  
           num857               data  
        2.369e+00         -9.245e-01  
           num415              num85  
        1.111e+00         -2.231e+00  
       technology            num1999  
        7.566e-01          8.572e-02  
            parts                 pm  
       -5.501e-01         -1.005e+00  
           direct                 cs  
       -2.563e-01         -4.692e+01  
          meeting           original  
       -2.173e+00         -9.787e-01  
          project                 re  
       -1.610e+00         -7.536e-01  
              edu              table  
       -1.483e+00         -3.167e+00  
       conference      charSemicolon  
       -4.491e+00         -1.623e+00  
 charRoundbracket  charSquarebracket  
        1.356e-01         -6.342e-01  
  charExclamation         charDollar  
        2.497e-01          5.745e+00  
         charHash         capitalAve  
        2.223e+00         -1.661e-03  
      capitalLong       capitalTotal  
        8.800e-03          7.263e-04  
 
Degrees of Freedom: 3450 Total (i.e. Null);  3393 Residual
Null Deviance:     4628 
Residual Deviance: 1297        AIC: 1413
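
The coefficients above are easier to read as odds ratios. Assuming the fit is a binomial (logistic) GLM that models the probability of "spam", which is the usual behaviour for a two-class outcome, each coefficient is a change in log-odds, and exponentiating it gives the multiplicative change in the odds of spam per unit increase in that predictor. The values below are copied from the printed output; the selection of four predictors is arbitrary.

```r
# A few coefficients copied from the fit$finalModel output.
coefs <- c(remove = 2.612, charDollar = 5.745, hp = -1.982, george = -9.258)

# exp(coefficient) = odds ratio for a one-unit increase in the predictor.
odds_ratio <- exp(coefs)
round(odds_ratio, 4)

# Predictors with an odds ratio above 1 push a message towards "spam";
# those below 1 push it towards "nonspam".
names(odds_ratio)[odds_ratio > 1]
names(odds_ratio)[odds_ratio < 1]

# Baseline probability of spam when every predictor is zero,
# obtained by applying the inverse logit to the intercept.
intercept <- -1.515
plogis(intercept)
```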

PREDICTIONS:

> predictions<- predict(fit, newdata=testing)
> predictions
   [1] spam    spam    spam    spam    spam   
   [6] spam    nonspam spam    spam    spam   
  [11] spam    spam    spam    spam    spam   
  [16] spam    nonspam spam    spam    spam   
  [21] spam    spam    spam    spam    spam   
  [26] nonspam spam    spam    spam    spam   
  [31] nonspam spam    spam    spam    spam   
  [36] spam    spam    spam    spam    spam   
  [41] spam    spam    spam    spam    spam   
  [46] spam    spam    spam    spam    spam   
  [51] spam    spam    spam    nonspam spam   
  [56] spam    spam    spam    spam    spam   
  [61] spam    spam    spam    spam    spam   
  [66] spam    spam    spam    spam    spam   
  [71] spam    spam    spam    spam    nonspam

> confusionMatrix(predictions,testing$type)
Confusion Matrix and Statistics

          Reference
Prediction nonspam spam
   nonspam     659   50
   spam         38  403
                                         
               Accuracy : 0.9235         
                 95% CI : (0.9066, 0.9382)
    No Information Rate : 0.6061          
    P-Value [Acc > NIR] : <2e-16         
                                         
                  Kappa : 0.839          
 Mcnemar's Test P-Value : 0.241          
                                         
            Sensitivity : 0.9455          
            Specificity : 0.8896         
         Pos Pred Value : 0.9295         
         Neg Pred Value : 0.9138         
             Prevalence : 0.6061         
         Detection Rate : 0.5730         
   Detection Prevalence : 0.6165          
      Balanced Accuracy : 0.9176         
                                         
       'Positive' Class : nonspam        
                                         
>
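
The statistics reported by confusionMatrix can be recomputed by hand from the four cell counts, which makes their definitions explicit. The sketch below uses base R only and takes "nonspam" as the positive class, as in the output above; the variable names are chosen for illustration.

```r
# Cell counts copied from the confusion matrix (rows = prediction,
# columns = reference). matrix() fills column by column.
cm <- matrix(c(659, 38, 50, 403), nrow = 2,
             dimnames = list(Prediction = c("nonspam", "spam"),
                             Reference  = c("nonspam", "spam")))

tp <- cm["nonspam", "nonspam"]  # predicted nonspam, truly nonspam
fn <- cm["spam",    "nonspam"]  # predicted spam, truly nonspam
fp <- cm["nonspam", "spam"]     # predicted nonspam, truly spam
tn <- cm["spam",    "spam"]     # predicted spam, truly spam
n  <- sum(cm)

accuracy    <- (tp + tn) / n
sensitivity <- tp / (tp + fn)
specificity <- tn / (tn + fp)
pos_pred    <- tp / (tp + fp)
neg_pred    <- tn / (tn + fn)
prevalence  <- (tp + fn) / n
balanced    <- (sensitivity + specificity) / 2

# Cohen's kappa: agreement beyond what the row and column totals
# alone would produce by chance.
expected    <- sum(rowSums(cm) * colSums(cm)) / n^2
cohen_kappa <- (accuracy - expected) / (1 - expected)

round(c(Accuracy = accuracy, Kappa = cohen_kappa,
        Sensitivity = sensitivity, Specificity = specificity,
        PosPredValue = pos_pred, NegPredValue = neg_pred,
        Prevalence = prevalence, BalancedAccuracy = balanced), 4)
```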