Cloud + Machine-to-Machine = Disruption of Things: Part 1

Separation of Data Collection, Processing, Interface, and Control

The use of cloud computing means that data collection, processing, interface, and control can be separated and distributed to the most appropriate resources and devices. Current M2M implementations combine all four: either chips embedded in the sensors themselves, or an onsite laptop or desktop PC tied into a local mesh network, perform the data processing and determine the control logic.

Once data collection and processing move to the cloud, however, most current limitations disappear. One of the more obvious is that data limits go away. Storing data in the cloud means that the buffers within devices (whether they hold 500 or even 5,000 data points) no longer matter. Cloud storage is near limitless, so historic data can be saved for as long as it's deemed valuable.
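The contrast is easy to see in a few lines of code. The sketch below (purely illustrative, not any particular vendor's SDK) models a device's fixed-size buffer against an append-only cloud store: the device silently drops old readings once its buffer fills, while the cloud side keeps the full history.

```python
from collections import deque

# On-device buffer: a fixed-size ring that silently drops the oldest
# reading once capacity (here 500 points) is reached.
device_buffer = deque(maxlen=500)

# Cloud-side store: an append-only log with no practical size limit,
# modeled here as a plain list.
cloud_store = []

for t in range(2000):  # simulate 2,000 sensor readings
    reading = {"t": t, "value": 20.0 + (t % 7) * 0.1}
    device_buffer.append(reading)  # data older than 500 points is lost
    cloud_store.append(reading)    # full history is retained

print(len(device_buffer))  # 500 -- only the most recent window survives
print(len(cloud_store))    # 2000 -- every reading is still queryable
```

The device's 500-point window is gone forever once overwritten; the cloud copy can be queried a year later.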


The data can be used for showing readings, performance, or status for the last day, week, month, and even year. Individual nodes can be inspected as well as grouped together with similar data from other devices. Analysis can be performed quickly on specific groupings and filters – whether it’s product line, region, demographic, or application use. The consumer Web understands the value of data and the many permutations that analysis can take. Once M2M data moves to the cloud, M2M companies begin to have the same realizations.
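As a small sketch of that kind of slicing (the readings and field names here are hypothetical), each record lands in the cloud carrying metadata, so grouping by product line, region, or any other dimension is a one-pass aggregation:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical readings already landed in the cloud; each carries
# metadata that makes slicing by product line, region, etc. trivial.
readings = [
    {"device": "a1", "region": "west", "line": "pro",  "temp": 21.5},
    {"device": "a2", "region": "west", "line": "lite", "temp": 22.0},
    {"device": "b1", "region": "east", "line": "pro",  "temp": 19.0},
    {"device": "b2", "region": "east", "line": "pro",  "temp": 19.6},
]

# Group by one dimension and summarize. The same pattern covers
# day/week/month views by grouping on a timestamp bucket instead.
by_region = defaultdict(list)
for r in readings:
    by_region[r["region"]].append(r["temp"])

summary = {region: round(mean(temps), 2) for region, temps in by_region.items()}
print(summary)  # {'west': 21.75, 'east': 19.3}
```

Swap `"region"` for `"line"` or a date bucket and the same loop answers a different product question.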

Applications are also no longer restricted by tiny processors, low-power constraints, and special-purpose programming languages. Processing in the cloud brings best-of-breed programming capabilities. These include widely popular programming languages and frameworks, flexible data structures, and extensive algorithm libraries.

Not only is there access to super-fast processors; if server or storage bottlenecks arise, they can be addressed by launching more servers on demand or scaling storage horizontally. Using the cloud and dynamic languages and frameworks, the cost of ownership goes way down and the limitations go away. There's also a huge increase in the speed of product development.

Lastly, interfaces can move to Web browsers, wallboards, mobile phones, and tablets, eliminating the need to build a screen into every device or to keep local computers permanently on site. Medical devices no longer have to come with their own monitors. Separating the data input from the processing and from the screen readout not only means lower costs (fewer components per device) but also easier upgrading of diagnostics and far better visualization capabilities.

An MRI or sonogram sent to the cloud, digitally refined using the latest algorithms distributed across multiple machines, and presented on an iPad or an HDTV screen is going to look a lot better and be more insightful than if displayed on existing monitors, no matter how new the device is. Separating the components and putting the data and processing in the cloud allows devices to keep getting smarter while not necessarily becoming more complex.

Data Virtualization

Data storage is one of the biggest advantages of using the cloud for M2M applications. The cloud not only offers simple and virtual ways to run applications, it also offers simple and virtual ways to store data. Cloud infrastructure companies are increasingly offering simple-to-use services to provision and maintain databases. These services even extend to databases as a service – expandable data storage at the end of an IP address, with the management of servers, disks, backups, and other operational issues masked or eliminated. Examples include Amazon's SimpleDB and RDS services and Salesforce's Database.com offering.
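From the application's point of view, such a service is just put and get at an endpoint. The stand-in class below is purely illustrative (an in-memory mock, not a real SimpleDB or Database.com client), but it captures the shape of the contract: the caller never touches servers, disks, or backups.

```python
# A stand-in for a hosted database service: the client sees only
# put/get at an endpoint; servers, disks, and backups are invisible.
# (This in-memory class is illustrative, not a real service client.)
class HostedKVStore:
    def __init__(self, endpoint):
        self.endpoint = endpoint  # e.g. a service URL -- nothing to administer
        self._items = {}

    def put(self, key, attributes):
        """Store a dict of attributes under a key; capacity grows on demand."""
        self._items[key] = dict(attributes)

    def get(self, key):
        """Retrieve the attributes stored under a key, or None."""
        return self._items.get(key)

db = HostedKVStore("db.example.com")  # hypothetical endpoint
db.put("sensor-42", {"temp": "21.5", "battery": "88%"})
print(db.get("sensor-42")["temp"])    # 21.5
```

A real deployment would swap the dictionary for HTTP calls to the hosted service; the application code above it would not change.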

Once transmitted to the cloud, data can be stored, retrieved and processed without having to address many of the underlying computing resources and processes traditionally associated with databases. For M2M applications, this type of virtualized data storage service is ideal.

Being able to seamlessly handle continuous streams of structured data from sensor sources is one of the more fundamental requirements for any distributed M2M application. As an example, Appoxy processes the data streams from network adapters from Plaster Networks, a fast-growing leader in the IP-over-powerline space. Status inputs are sent by the adapters continuously to Plaster, which runs its applications on one of the top cloud infrastructure providers. Appoxy processes these for use with the user dashboard running on the Web.
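The core of that kind of pipeline is a fold: each incoming status message updates a per-device view that the Web console reads at any time. The sketch below is illustrative only (the field names and values are invented, not Plaster's or Appoxy's actual data model):

```python
# Live per-adapter state that a web console would read.
dashboard = {}

def ingest(update):
    """Fold one adapter status message into the dashboard state."""
    adapter = dashboard.setdefault(
        update["adapter_id"], {"messages": 0, "last_rate_mbps": None}
    )
    adapter["messages"] += 1
    adapter["last_rate_mbps"] = update["rate_mbps"]

# A few messages from a simulated continuous stream.
stream = [
    {"adapter_id": "plc-01", "rate_mbps": 85},
    {"adapter_id": "plc-02", "rate_mbps": 60},
    {"adapter_id": "plc-01", "rate_mbps": 82},
]
for update in stream:
    ingest(update)

print(dashboard["plc-01"])  # {'messages': 2, 'last_rate_mbps': 82}
```

Because each message folds in independently, the same function can run across many workers as volume grows.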

This console provides insights to users on the status and performance of the adapters and their networks (allowing users to see whether they are getting optimal performance out of their devices and networks). The data also provides valuable diagnostic information to Plaster, dramatically reducing support calls and improving device usage and customer satisfaction. The information is also invaluable for product development. Data on the performance of new products and features can be assessed in real-time, providing insights that would otherwise be unattainable from devices in the field.

This type of smart device marks the beginning of the trend. It's a fair bet that all network devices will become smart and cloud-aware, followed by all vehicles, manufacturing equipment, and almost all machines of any substance.

The types of data stores available include SQL, NoSQL, and block or file storage. SQL is useful for many application needs, but the massive, parallel, and continuous streams of M2M data lend themselves well to NoSQL approaches. These data stores use key-value associations, which allow for a flatter, non-relational form of association. NoSQL databases can work without fixed table schemas, which makes it easy to store different data formats as well as evolve and expand formats over time.
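Schema flexibility matters for M2M because device firmware evolves. In this small illustration (modeling a key-value store as a plain dict, with invented field names), records from two firmware generations coexist in the same store with no migration step:

```python
# A key-value store modeled as key -> dict. With no fixed schema,
# records from different device generations can coexist.
store = {}

# v1 firmware reports only temperature.
store["sensor-1:2011-05-01T12:00"] = {"temp": 21.5}

# v2 firmware adds humidity and battery -- no schema migration needed.
store["sensor-1:2011-06-01T12:00"] = {
    "temp": 21.9, "humidity": 40, "battery": 0.88,
}

# Consumers simply read whichever fields are present.
for key, record in sorted(store.items()):
    print(key, record.get("humidity", "n/a"))
```

A relational table would have required an `ALTER TABLE` (or a nullable column planned in advance) for the same change.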

NoSQL databases are also easy to scale horizontally. Data is distributed across many servers and disks. Indexing is performed on keys, which route each query to the datastore serving that key's range. This means different clusters respond to requests independently of one another, greatly increasing throughput and response times. Growth can be accommodated by quickly adding new servers, database instances, and disks and changing the ranges of keys.
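Key-range routing itself is simple enough to sketch in a few lines. The node names and range boundaries below are invented for illustration; real systems layer replication and rebalancing on top of the same idea:

```python
import bisect

# Each shard owns a contiguous key range, identified by its lower
# bound. Queries route independently, shard by shard.
range_starts = ["a", "h", "p"]                    # sorted lower bounds
shards = ["db-node-1", "db-node-2", "db-node-3"]  # one node per range

def route(key):
    """Return the shard whose key range contains this key."""
    i = bisect.bisect_right(range_starts, key) - 1
    return shards[max(i, 0)]

print(route("gateway-07"))  # db-node-1 ('a' <= key < 'h')
print(route("meter-22"))    # db-node-2 ('h' <= key < 'p')
print(route("sensor-15"))   # db-node-3 ('p' <= key)
```

Growth is handled exactly as the paragraph describes: adding a node means inserting a new lower bound into `range_starts` and its server into `shards`.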

The NoSQL approach plays well in M2M applications. Data for sensors or groups of sensors can be clustered together and accessed by an ever-expanding set of processes without adversely affecting performance. If a datastore gets too large or has too many requests, it can be split into smaller chunks or "shards." If there are many requests on the same data, it can be replicated into multiple data sets, with each process hitting a different copy, lessening the likelihood of request collisions.
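A shard split can be sketched just as compactly. This is a deliberately simplified version, splitting one in-memory shard at the median key (real systems split live, under load, and update the routing table):

```python
# Split an oversized shard's key range at its median key, moving
# half the data to a new shard (a simplified, illustrative split).
def split_shard(shard):
    keys = sorted(shard)
    mid = keys[len(keys) // 2]
    left = {k: v for k, v in shard.items() if k < mid}
    right = {k: v for k, v in shard.items() if k >= mid}
    return mid, left, right

shard = {f"sensor-{i:03d}": i for i in range(100)}  # 100 readings
mid, left, right = split_shard(shard)
print(mid, len(left), len(right))  # sensor-050 50 50
```

After the split, the router from the previous sketch would get `mid` as a new range boundary, and the two halves serve requests independently.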
