As of 2012, the size of data sets that could feasibly be processed in a reasonable amount of time was limited to the exabyte level. Data sets grow in size in part because they are increasingly being gathered by ubiquitous information-sensing mobile devices, aerial sensory technologies (remote sensing), cameras, software logs, microphones, radio-frequency identification readers, wireless sensor networks and other such applications. Limitations due to large data sets are encountered in many areas, including genomics, meteorology, complex physics simulations, and biological and environmental research. This article takes a look at how the phenomenon of Big Data has affected the T&M world.
‘Big Data’ is the term for a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications. The challenges include capture, storage, curation, search, transfer, sharing, analysis and visualisation.
In test and measurement applications, engineers and scientists often end up collecting huge amounts of data every second of every day.
Take the Large Hadron Collider (LHC), for example. For every second of its operation, a test engineer has to deal with an additional 40 terabytes of data. (The prefix ‘tera’ denotes a ‘1’ followed by twelve zeros.) Similarly, every 30 minutes of a Boeing jet engine run adds another 10 terabytes of valuable data, which translates to over 640 TB for just a single journey across the Atlantic. Multiply that by more than 25,000 flights each day, and you get an idea of the enormous amount of data being generated. Suffice it to say that the data available today is so huge that there is a whole branch devoted to deciphering it.
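The 640 TB figure can be checked with some back-of-the-envelope arithmetic. The sketch below assumes a four-engine aircraft and an eight-hour crossing; neither number is stated above, and only the rate of 10 TB per 30 minutes and the 25,000 daily flights come from the text. The daily total is an illustrative upper bound that treats every flight as such a crossing.

```python
# Rough check of the jet-engine data volumes quoted in the article.
TB_PER_ENGINE_PER_HALF_HOUR = 10   # from the article
FLIGHTS_PER_DAY = 25_000           # from the article
ENGINES = 4                        # assumption: four-engine aircraft
FLIGHT_HOURS = 8                   # assumption: eight-hour Atlantic crossing


def flight_data_tb(engines=ENGINES, hours=FLIGHT_HOURS):
    """Terabytes generated by all engines over one flight."""
    half_hours = hours * 2
    return TB_PER_ENGINE_PER_HALF_HOUR * half_hours * engines


per_flight_tb = flight_data_tb()
# 1 exabyte = 1,000,000 terabytes (decimal prefixes)
per_day_eb = per_flight_tb * FLIGHTS_PER_DAY / 1_000_000

print(f"Per flight: {per_flight_tb} TB")
print(f"Per day (upper bound): {per_day_eb} EB")
```

Under these assumptions a single crossing yields 640 TB, and 25,000 such flights would add up to 16 exabytes a day.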
Implications of data growth in the design space
The impact of data growth is primarily reflected in a need for faster data storage as well as faster data transfer. This has resulted in the development of ultra-fast storage drives, circuits that can handle gigabit data transfer speeds, faster backplanes for servers and faster networking to handle the increased data. This, in turn, has led to the emergence of the following faster serial standards for data transfer between chips and systems, as well as data streaming to storage drives:
SAS-3 (serial-attached SCSI-3). SAS is a point-to-point serial protocol that moves data to and from storage devices. SAS-3 is the latest SAS standard with a bus speed of 12 Gbps.
Fibre Channel. It is a high-speed network technology used to connect computers for data storage. It has become a common connection type for storage area networks in enterprise storage. Faster versions of Fibre Channel are 16G and 28G.
PCI Express Gen 3, InfiniBand, RapidIO and HyperTransport. These are fast buses used for high-performance backplanes.
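To see why these line rates matter, consider how long it takes to move the data from a 30-minute engine run over a single link. The sketch below is a rough estimate only: it uses the raw line rate and ignores encoding and protocol overhead, so real transfers would be slower. Treating ‘28G’ as a nominal 28 Gbps is an assumption made for illustration.

```python
# Idealised transfer time over a serial link at its raw line rate.
TB = 10**12  # bytes in a terabyte (decimal)


def transfer_seconds(data_bytes, line_rate_gbps):
    """Seconds to move data_bytes at line_rate_gbps, with no overhead."""
    if line_rate_gbps <= 0:
        raise ValueError("line rate must be positive")
    bits = data_bytes * 8
    return bits / (line_rate_gbps * 1e9)


links_gbps = {
    "SAS-3 (12 Gbps)": 12,        # bus speed from the article
    "28G link (nominal)": 28,     # assumption: 28G taken as 28 Gbps
}

for name, rate in links_gbps.items():
    hours = transfer_seconds(10 * TB, rate) / 3600
    print(f"{name}: {hours:.2f} hours for 10 TB")
```

Even at the raw 12 Gbps rate, 10 TB needs close to two hours on one link, which is far longer than the 30 minutes it took to generate.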
Where does Big Data come in?
Testing and validating the latest ultra-fast buses requires high-performance test and measurement systems, which presents both a tremendous challenge and a business opportunity for test and measurement companies. The typical needs are testing the transmitter and the receiver, and validating the connection medium (cables and the like). For physical-layer testing of the transmitter, ultra-high-bandwidth oscilloscopes with a low noise floor and low intrinsic jitter are required.
Sanchit Bhatia, digital applications specialist, Agilent Technologies, cites an example: “Fibre Channel 28G requires a 45GHz real-time scope for electrical validation and SAS-3 requires a 20GHz oscilloscope. Moving on to the receiver validation, bit-error-rate testers operating at up to 28G data rate are required. These testers stress the receiver by applying calculated jitter and measure the bit error rate. Protocol validation is also done in these high-speed buses through custom-designed high-performance protocol analysers.”
Big Data opens three interesting areas for test and measurement: “First, test and measurement has so far been limited to labs and manufacturing lines. With Big Data catching fire, the boundaries of time and distance have broken down and test and measurement now happens on the field. Some applications that have caught on very significantly are online condition monitoring of systems (in other words, aggregation of data). Second, remote monitoring, testing and diagnostics of systems deployed in remote locations without physical access have gained a lot of leverage, which means easier access to data. Last, near-infinite computing resources in the cloud provide an opportunity for software to offload computationally heavy tasks. These can be sophisticated image or signal processing or even compilation and development, which, in short, is ‘offloading,’” explains Satish Mohanram, technical marketing manager, National Instruments India.