While video was once an alternative communication option, over the last few years, it’s become expected by consumers. We are in the middle of a digital communication transformation that has only been accelerated by the COVID-19 pandemic. With 96% of employers in our recent State of Customer Engagement report acknowledging an acceleration, and 78% of respondents reporting that they use video communication with their customers, it won’t be long before every organization uses video in some way.
Video communications are quickly becoming the norm and we can clearly see why. Video increases empathy, and face-to-face communication creates stronger connections. Connection is often the catalyst to turn regular customers into happy, paying customers.
What will it take for your organization to build a memorable video experience? Arriving at your answer will require deep consideration and understanding of the application you seek to build:
What functionality will your video app need?
Do you want to build an embedded, customized experience?
Do you need access to developer resources and tooling?
Do you want to build an omnichannel engagement solution?
Will you need to build your own solution? Or can you leverage a SaaS/API offering?
To help with this seemingly overwhelming task, we’ve created a checklist of everything you need to consider when productizing video in your app.
Not all APIs or SaaS offerings can provide the integration and customization that your application may require. It’s important to define your application’s non-negotiable requirements; from there, you’ll be better positioned to evaluate which APIs and SDKs can fit your build specifications.
Twilio has backend SDKs (C#, Java, Node.js, PHP, Python, Ruby) as well as SDKs for multiple platforms (web, iOS, Android). It also provides quickstart guides that demonstrate how to build a video application using its Programmable Video offering, which provides an implementation of a UI.
Twilio’s Guide for the Impatient contains a grid of the different types of interactions available and the recommended settings for each across desktop browsers, mobile browsers, and mobile SDKs (for native apps).
Be aware of where your SaaS provider will store the recordings. Typically, the provider themselves will host the video and allow access to it programmatically. However, Twilio offers the ability to store recordings directly into AWS S3.
Twilio utilizes an approach where all communication streams traverse encrypted channels. They are encrypted at rest but may need to be decrypted in memory for composition or transcoding the video for quality purposes.
Twilio also allows for configuring encryption keys so that content is always stored encrypted. Encrypted content can only be decrypted by the holder of encryption keys: you. For security, and especially if your video chat is to be integrated into a telehealth offering, this strategy for encryption may be appropriate for the needs of your application.
Users have been exposed to new and innovative tools that work in conjunction with video chat. Whiteboards for group collaboration, screen sharing, and text chat are just some of the add-ons that users are accustomed to seeing with their video chat applications. Some providers offer integrations for these collaboration add-ons. For example, Twilio’s DataTrack API and Conversations API, or pre-designed layout APIs from Jitsi can help you move forward with these add-ons.
The end user platform refers to how your customers will use your application. Will it be through the web (desktop computer and browser), mobile (native application or mobile browser-based), or both?
The choice you make here impacts different aspects of your project and timeline. Whichever end-user platform(s) you choose, make sure your development platform supports your application needs with SDKs that work with the languages used by your engineering teams.
With any type of web application, video chat included, you’ll need to work through the basics of user account registration, login, and authentication. Your implementation of authentication and authorization is closely tied to your system’s security.
Most video platforms handle authentication similarly, requiring a server-side component that generates a token, which is then consumed by a client when connecting to a session. The difference between platforms lies in how server-side token generation works with secrets. Twilio, for example, uses an API Key Secret. In addition, Twilio provides subaccounts and different key levels for more granular authorization and permissions control.
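The server-side token pattern described above can be sketched in a few lines. This is a generic, standard-library illustration of signing a short-lived token with a shared secret; it is not Twilio’s token format or API. The function names (issue_room_token, verify_room_token) and the claim names (sub, room) are hypothetical. In a real build, you would use your provider’s server SDK to mint tokens.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs do."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Decode base64url text, restoring the stripped padding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_room_token(secret: str, identity: str, room: str,
                     ttl_seconds: int = 3600,
                     now: Optional[float] = None) -> str:
    """Server side: sign a short-lived HS256 token for one user and room.

    The claim names here are illustrative assumptions, not a provider's schema.
    """
    issued_at = int(time.time() if now is None else now)
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {"sub": identity, "room": room,
              "iat": issued_at, "exp": issued_at + ttl_seconds}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(secret.encode("utf-8"),
                         signing_input.encode("ascii"),
                         hashlib.sha256).digest()
    return signing_input + "." + _b64url(signature)


def verify_room_token(secret: str, token: str,
                      now: Optional[float] = None) -> dict:
    """Check the signature and expiry, then return the claims."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    signing_input = parts[0] + "." + parts[1]
    expected = hmac.new(secret.encode("utf-8"),
                        signing_input.encode("ascii"),
                        hashlib.sha256).digest()
    # Constant-time comparison avoids leaking signature bytes via timing.
    if not hmac.compare_digest(expected, _b64url_decode(parts[2])):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(parts[1]))
    current = time.time() if now is None else now
    if current >= claims["exp"]:
        raise ValueError("token expired")
    return claims
```

The secret never leaves the server; the client only ever sees the signed token, which expires on its own. That split is what lets a platform hand tokens to untrusted browsers and mobile apps.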
Your business needs will dictate the different kinds of video interaction that your application will offer. In some cases (for example, a customer service interaction), a 1:1 video chat is usually sufficient. In other cases (for example, a focus group session), group chat is necessary. If your application facilitates group video chat, will it be a single presenter speaking to an audience, or will there be roundtable discussions and group interaction?
Consider whether your application needs to support end-to-end encryption. Keep in mind that you cannot offer full end-to-end encryption and record on the server simultaneously, since connections between users and the server must be decrypted in memory in order to save to disk.
Your organization may deal with protected health information (PHI) or be subject to HIPAA regulations. In that case, Twilio will sign a Business Associate Addendum (BAA) to ensure that you can build a HIPAA-compliant application. To learn more about how to build a HIPAA-compliant workflow using Twilio’s offerings, please refer to Architecting for HIPAA on Twilio.
Developing high-quality video applications is critical to creating memorable experiences. Twilio provides many different APIs to help provide the experience you need to deliver.
Twilio’s Network Bandwidth Profile API abstracts away the heavy lifting required to use resources more efficiently, giving higher priority to those video tracks that are more important. You can specify render dimensions, maximum bitrate, and maximum number of video tracks, thereby giving users an optimal video experience.
In addition, Twilio’s Dominant Speaker Detection API, Track Priority API, and Network Quality API all contribute to providing the perfect balance between optimal resource usage and optimal user experience. Plus, Twilio’s Video Insights gives you access to usage and quality metrics across all your rooms and participants, so you can observe your application, discover trends, detect issues, and troubleshoot.
New noise cancellation technologies are emerging, driving us further away from the once underwhelming video experience of many apps. While operating systems, browsers, and video chat platforms themselves all include different layers of noise/echo cancellation, there are ways to utilize machine learning to build an unparalleled aural experience. Twilio offers AI-based Noise Cancellation for all Video Group Rooms customers so you can provide best-in-class audio experiences directly in your video application.
Regarding layout, you may build a custom layout based on the needs and comfort levels of your users.
Your video use case and communication style will inform your design. For example, the layout and usage of screen real estate varies between 1:1 video, group chats, and sessions with a presenter and screen share. A layout and implementation that focuses on the active speaker also contributes toward a memorable video experience for your users.
Your users’ devices also impact video layout choices. Connecting from a web browser on a desktop computer may dictate a fixed layout, whereas connecting from a web browser on a tablet may need to support screen rotations and different layouts. When users connect from mobile devices with smaller screens, your options for how to use screen real estate become even more limited.
Only by tracking customer satisfaction scores (CSAT) can you measure whether a customer feels their needs have been fulfilled. CSAT surveys can be administered via SMS or email, or by asking for ratings after interactions with customer support.
An NPS survey is different; rather than asking a customer to evaluate their current feelings toward the company or product, it is focused on the future—would they recommend you to their friends or colleagues?
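To make the difference between the two scores concrete, here is a minimal sketch of how each is commonly calculated. It assumes CSAT is collected on a 1 to 5 scale where 4 and 5 count as satisfied, and NPS on a 0 to 10 scale where 9 and 10 are promoters and 0 through 6 are detractors. Those scales are common conventions rather than something this checklist specifies, and the function names are ours.

```python
from typing import Sequence


def csat_percent(ratings: Sequence[int], satisfied_threshold: int = 4) -> float:
    """Share of responses at or above the threshold, as a percentage.

    Assumes a 1-5 rating scale; the threshold of 4 is a common convention.
    """
    if not ratings:
        raise ValueError("no ratings collected")
    satisfied = sum(1 for rating in ratings if rating >= satisfied_threshold)
    return 100.0 * satisfied / len(ratings)


def net_promoter_score(scores: Sequence[int]) -> float:
    """Percent promoters (9-10) minus percent detractors (0-6).

    Passives (7-8) count toward the total but toward neither group.
    """
    if not scores:
        raise ValueError("no scores collected")
    promoters = sum(1 for score in scores if score >= 9)
    detractors = sum(1 for score in scores if score <= 6)
    return 100.0 * (promoters - detractors) / len(scores)
```

CSAT ranges from 0 to 100, while NPS ranges from -100 to 100, so the two numbers are not directly comparable and are best tracked side by side.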
Your system’s metrics, logs, and traces provide a wealth of data that must be used if you are to gain insight into usage patterns, issues, and opportunities for improvement. However, the sheer amount of data available can be overwhelming unless you have tools to help you capture, filter through, and understand that data.
When an event (for example, a user’s sign on to a video chat session) has a ripple effect from the client application to the server and across a network of distributed resources, it’s important to associate all of those ripples with the originating event.
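One common way to tie those ripples together is a correlation ID that is created once, at the originating event, and attached to every record emitted downstream. The sketch below uses an in-memory list as a stand-in for a centralized log store; the EventLog class, the component names, and the handle_sign_on flow are all hypothetical.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class EventLog:
    """In-memory stand-in for a centralized log store (hypothetical)."""
    records: List[Dict[str, str]] = field(default_factory=list)

    def emit(self, correlation_id: str, component: str, message: str) -> None:
        """Record one log line tagged with the originating event's ID."""
        self.records.append({
            "correlation_id": correlation_id,
            "component": component,
            "message": message,
        })

    def trace(self, correlation_id: str) -> List[Dict[str, str]]:
        """Return every record that belongs to one originating event."""
        return [r for r in self.records if r["correlation_id"] == correlation_id]


def handle_sign_on(log: EventLog, user: str) -> str:
    """Simulate one sign-on rippling through three components."""
    correlation_id = uuid.uuid4().hex  # created once, at the originating event
    log.emit(correlation_id, "client", f"{user} requested to join")
    log.emit(correlation_id, "auth-service", f"token issued for {user}")
    log.emit(correlation_id, "media-server", f"{user} connected to room")
    return correlation_id
```

In a real system the ID would travel between services in a request header or message field, but the principle is the same: one query by ID returns the whole story of a single sign-on.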
With your metrics, logs, and traces all in a centralized location, your next task is to build visualizations to help your team make sense of the data. Dashboards help you see cross-referenced resource consumption and usage patterns. Whether you need to see the number of active users, the different types of video sessions currently running, platforms and devices used, or even core resource consumption, dashboards will be your go-to tool for finding signals amidst the noise.
Luckily, Twilio’s Video Insights provides analytics and aggregations for observing your application, discovering trends, and troubleshooting rooms and participants. Video Insights is free for Group, Peer-to-Peer, and WebRTC Go Room developers and is available for any application built with Twilio Video’s JavaScript, iOS, or Android SDKs.
Particularly for video chat applications, you can take advantage of webhooks to collect data on key user events like room creation, session joining, or recording failures.
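As a rough illustration, a webhook receiver can be as simple as parsing each incoming request body and tallying the event type. The sketch below assumes the provider posts form-encoded bodies and names the event in a StatusCallbackEvent field, with values such as room-created or recording-failed. Treat the field name and event values as assumptions and confirm them against your provider’s webhook documentation.

```python
from collections import Counter
from urllib.parse import parse_qs


def record_webhook_event(counts: Counter, form_body: str,
                         event_field: str = "StatusCallbackEvent") -> str:
    """Parse one form-encoded webhook body and tally its event type.

    The default field name is an assumption; pass your provider's actual one.
    """
    fields = parse_qs(form_body)
    # parse_qs maps each key to a list of values; take the first, if any.
    event = fields.get(event_field, ["unknown"])[0]
    counts[event] += 1
    return event
```

A web framework would call this from its request handler; from there, the counts can feed a dashboard or trigger an alert when failures spike.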
“Video Insights has been a much-welcomed troubleshooting tool and ultimately, has equipped us with the data needed to more efficiently help our users improve their telehealth experience. In particular, Video Insights has become instrumental in our investigation of quality issues. By utilizing the threshold data that Twilio provides to identify a stable and healthy connection, we're able to quickly identify the source of the issue and assist the customer in taking precise adjustments to improve their connection and overall experience with the platform.”
Brennan Fahselt,
Technical Product Specialist Lead
Disk storage is unlimited—as long as your budget is unlimited. Of course, the rest of us need to be judicious about our recording storage strategy!
Earlier in our checklist, we covered some of the basic numbers regarding necessary storage capacity. Some host-based providers offer a storage limit per host. Purchasing additional storage or incurring overage fees can quickly exhaust a budget. Twilio, on the other hand, charges a low flat rate per GB stored per day.
Your team would be wise to implement a plan for time-based auto-deletion or cold storage archiving. Doing so will help manage storage needs and costs.
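A retention plan like this usually boils down to a small rule that runs on a schedule. Here is a minimal sketch that decides what to do with a recording based on its age. The 30-day and 365-day thresholds are placeholder values, and the function name is ours; your own numbers will depend on your compliance and budget requirements.

```python
from datetime import datetime, timedelta


def retention_action(created_at: datetime, now: datetime,
                     archive_after_days: int = 30,
                     delete_after_days: int = 365) -> str:
    """Return "keep", "archive", or "delete" for a recording of a given age.

    Thresholds are illustrative placeholders, not recommendations.
    """
    if delete_after_days < archive_after_days:
        raise ValueError("delete threshold must not precede archive threshold")
    age = now - created_at
    if age >= timedelta(days=delete_after_days):
        return "delete"
    if age >= timedelta(days=archive_after_days):
        return "archive"
    return "keep"
```

A nightly job could apply this rule to every recording and then call your storage provider to move or remove the files it flags.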
Businesses that leverage group room technology often need to facilitate meetings or conferences. Group learning environments—such as in educational institutions (like LearnCube), fitness programs (like Barry’s), or training centers—also utilize group rooms for video rather than 1:1.
The dynamic of a group room video session brings unique concerns which your platform will need to address. For example, an active speaker indicator and a presenter’s ability to mute all participants or disable participant cameras are must-haves.
Participants in a group room expect chat functionality that runs in parallel to video. You’ll need to think through messaging (to everyone versus to specific individuals) and other aspects such as rich content or emoji support.
In addition, you will need to consider the layout for supporting a room with dozens (or hundreds) of participants. Should there be pagination, customizable views based on user selection, or limits?
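Pagination is one way to answer that question. The sketch below splits a participant list into pages of video tiles and optionally keeps one pinned participant (for example, the presenter or active speaker) at the front of every page. The function names and the default of nine tiles per page are illustrative choices, and participant identities are assumed to be unique.

```python
import math
from typing import List, Optional


def visible_tiles(participants: List[str], page: int, page_size: int = 9,
                  pinned: Optional[str] = None) -> List[str]:
    """Return the participants to render on one page of the grid.

    If `pinned` is in the room, it takes the first tile on every page.
    """
    if page < 0 or page_size < 2:
        raise ValueError("page must be >= 0 and page_size >= 2")
    if pinned is not None and pinned in participants:
        others = [p for p in participants if p != pinned]
        slots = page_size - 1  # one tile is reserved for the pinned participant
        start = page * slots
        return [pinned] + others[start:start + slots]
    start = page * page_size
    return participants[start:start + page_size]


def page_count(participants: List[str], page_size: int = 9,
               pinned: Optional[str] = None) -> int:
    """Number of pages needed to show everyone at least once."""
    if pinned is not None and pinned in participants:
        remaining, slots = len(participants) - 1, page_size - 1
    else:
        remaining, slots = len(participants), page_size
    return max(1, math.ceil(remaining / slots))
```

Smaller screens can simply pass a smaller page_size, which keeps the layout logic the same across desktop, tablet, and mobile.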
With multiple session participants added into the mix, noise and echo will be an issue. To mitigate unwanted noise, the Twilio Video SDK supports Acoustic Echo Cancellation (AEC) and Noise Suppression (NS).
Lastly, large groups often have the need for breakout rooms to accommodate smaller groups. It will be important to work through both the design of the user experience flow and the technical aspects of one group room session spawning several others.
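On the technical side, the first step is deciding who goes where. Here is a minimal sketch that distributes participants across a number of breakout rooms in round-robin order so that room sizes differ by at most one. The room naming scheme and function name are hypothetical; actually creating the rooms and moving participants would go through your video provider’s API.

```python
from typing import Dict, List


def assign_breakout_rooms(participants: List[str], room_count: int,
                          parent_room: str = "main") -> Dict[str, List[str]]:
    """Spread participants evenly across `room_count` breakout rooms.

    Room names are derived from the parent room; the scheme is illustrative.
    """
    if room_count < 1:
        raise ValueError("room_count must be at least 1")
    names = [f"{parent_room}-breakout-{i + 1}" for i in range(room_count)]
    rooms: Dict[str, List[str]] = {name: [] for name in names}
    for index, participant in enumerate(participants):
        # Round-robin keeps room sizes within one participant of each other.
        rooms[names[index % room_count]].append(participant)
    return rooms
```

Keeping the parent room’s name in each breakout name makes it straightforward to bring everyone back to the main session afterward.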
In contrast, business contexts such as telehealth, document e-signing, or customer service make use of 1:1 video sessions rather than group rooms. Applications facilitating social interactions also typically leverage 1:1 video. While 1:1 sessions may seem simpler, there are still several considerations to keep in mind.
Many business use cases will additionally require features like screen sharing, whiteboarding, or other real-time collaboration niceties. File sharing (being able to send attachments to one another from within the video room) will also be considered a must-have for many contexts.
Other feature considerations include virtual backgrounds, integrating augmented reality (AR) technology, and recording or transcription services.
Regardless of the business context for your video platform—whether you need to focus on supporting group rooms or 1:1 sessions—the Twilio Video platform provides a wide array of tools for easy integration of the features needed to match your use case.
COVID-19 has accelerated digital communication strategy by six years. For your apps to remain relevant, integrating video is vital. Keeping the above items in mind while you plan for your app will help your team build the correct solution and create the best video communication experience for your users.
We hope that this checklist makes it easier to plan your video integration. As you move forward with an API/SaaS offering to bolster your platform, understanding the functionality required to build a fantastic experience is key to creating a delightful video chat app. Having a complete understanding of the scope of the problem is fundamental to success.