

Foreword
Human-computer interaction paradigms are poised for a revolutionary transformation. Generally, a computing device comprises input technologies for users to provide instructions or commands, processing technologies to execute computing tasks according to those inputs, and output technologies to return results or responses to the user. Looking back into the past, the introduction of the mouse and keyboard had a profound impact on the evolution and growth of personal computers. Along with rapid advances in computing, display, and communications technologies, the advent of easy and affordable means of providing user input was crucial to the phenomenal global adoption of computers. Similarly, the introduction of remote control devices helped make televisions more convenient and contributed to their widespread adoption around the world.
However, while these input devices helped define the computing and entertainment hubs in homes and offices, and spurred the development of a plethora of useful applications, they also limit the scope of human interfaces by requiring users to interact with on-screen content indirectly. For example, the user has to drag the computer mouse around on the table, which in turn controls the movement of the digital content on the computer screen. In the real world, we interact with objects directly. When I want to move a book from the table to the shelf, I directly pick it up with my hand and place it where I want it. Imagine if I had to perform that task through an indirect manipulation scheme, such as having to move another physical object around in space to dictate the location and movement of the book. It would make interaction with objects in the physical world extremely inconvenient and frustrating!
In recent years, the introduction of touch screen technologies in mobile devices has enabled direct user interaction with the content on displays. The impact of this on new mobile devices and applications has been unprecedented. The natural user interfaces built on touch input have fueled the creation of a vast number of new applications, and played a key role in the rapid growth of mobile device categories across the globe. With input and interaction schemes that are both easy to use and fun, mobile devices have penetrated all walks of life. While the resulting user experiences are much more compelling than those built on indirect input devices such as the mouse, touch-screen-based user interactions are still limited to a 2D plane. Even though the screens on these devices display 3D graphical content at different virtual planes, users can only touch the surface of the screen to manipulate the content.
We live in a 3D world, and are used to navigating and manipulating objects in 3D space. Equipped with a rich set of natural sensors, we see, hear, touch, smell, and taste the world, as well as perceive depth and interact in the 3D environment. For example, we grab an object with our fingers, bring it close to our eyes to have a better look, and then place it back at a distance. We hold a door knob with our hand, twist it by an angle, and then pull or push it toward or away from us to open the door. When we play with a slingshot, we pull the sling closer to us and then release it to hurl the object into 3D space. These are just a few examples of activities we perform in the physical world, where we take 3D interactions in our daily lives for granted.
Implementing such natural interactions between humans and computers requires the development and integration of new input technologies. Specifically, to capture 3D spatial information and accurately recognize human actions in 3D space, a computing device needs the ability to sense and understand the 3D world in real time. However, today's computing devices are only equipped with 2D imaging devices that were originally developed for capturing 2D pictures and videos. So is it time to add 3D visual sensing technology and depth perception to computing devices? Why does a machine need to “see” and “sense” the 3D world like humans do?
Let’s look back into the past again… this time way back, in fact about 540 million years back! Fossil records reveal that a period of roughly 70–80 million years starting at that time saw an exceptionally accelerated pace of diversification of biological organisms, an event that has been named the Cambrian explosion. There are many theories and debates about what triggered this phenomenon, including environmental as well as developmental factors. Among these, the onset of biological vision systems, including the ability to sense the world in 3D, has been credited as a key factor that advanced the capabilities of species via natural selection and, later on, the evolution of early mammals.
Similarly, computing devices equipped with real-time 3D visual sensing technologies can be developed to understand the 3D environment around them and to interact with humans and each other in much more natural and intuitive ways. Recognizing this, researchers and engineers in academia and industry have intensified research and development of 3D imaging and depth perception technologies in recent years. However, for these technologies and applications to become pervasive, several significant developments have to be realized. These include miniaturizing the 3D sensor modules so that they can be easily integrated into devices of all form factors, lowering power consumption for longer battery life in mobile devices, and reducing costs for mainstream adoption. Besides the sensors, other key technologies include 3D computer vision algorithms for enabling real-time understanding and interaction utilizing the 3D information, applications built using the new interfaces, and efficient hardware architectures for accelerating the algorithms and applications.
Intel® RealSense™ Technology has been developed and introduced to the computing ecosystem toward realizing this vision (www.intel.com/realsense). RealSense offerings include small-form-factor, low-power, and low-cost 3D sensing modules with onboard hardware acceleration of depth algorithms, along with software development kits (SDKs) incorporating a set of sophisticated middleware libraries with easy-to-use application programming interfaces (APIs).
The spectrum of new applications enabled by RealSense technologies and devices is endless. Laptops, all-in-one (AIO) computers, and 2-in-1/tablet devices with integrated RealSense cameras are already available from a number of leading computer makers and ramping in the market. The applications include user authentication via accurate face recognition; video chats and live streaming with virtual “green screen” effects via background segmentation; immersive gaming and application control with 3D gesture input; 3D scanning of humans, objects, and scenes; and virtual decoration and shopping, just to name a few categories. Beyond traditional computing devices, real-time 3D sensing and scene understanding capabilities built with RealSense cameras are enabling a new class of autonomous machines, including self-navigating robots, collision-avoiding drones, virtual dressing mirrors for retail shopping, and more. RealSense cameras are also enabling immersive augmented and virtual reality experiences by adding accurate tracking, user interactions, and mixed reality technologies.
This book, entitled “RealSense互动开发实战” and authored by 王曰海, 汤振宇, and 吴新天, will be a very useful tool for systems and applications developers. After a high-level introduction to the underlying technologies and application areas, the book dives into the RealSense SDK and APIs. It includes detailed discussion of the software architecture and application implementation techniques, along with numerous sample code examples. Following the principles and examples outlined throughout the book, creative developers and engineers will be better equipped to implement RealSense products in a wide range of systems and applications.
The era of perceptual computing is upon us, in which computing devices and machines can sense and perceive the 3D world, and interact with humans and each other in a natural and engaging manner. We are just at the beginning of a journey to profoundly transform the world of computers and machines. It’s time to unleash the Cambrian explosion of computing, enabled by rich real-time 3D sensing and perceptual computing technologies. I can’t wait to see what the future holds, and I invite you to join the journey to create the future together!
Achintya K. Bhowmik, Ph.D.
Vice President & General Manager, Perceptual Computing Group
Intel Corporation