External Understanding: Dissecting APIs inside of IoT devices (Part 1)

Introduction

As the world of IoT evolves, so does the security within it. One of the more secretive brands in this space is Apple. Despite Apple having had its systems compromised hundreds of times, its protocols reversed, and its source code leaked, its devices are still frustrating to work with because so little is publicly known about its custom implementations of specific protocols. In this article, we will discuss the internals of IoT devices, specifically looking at the Apple TV. For context, the Apple TV is a device manufactured by Apple as a smart home hub that lets you easily play movies, shows, or any other form of media while also being able to start interactive screen sharing/mirroring sessions. After a brief introduction to how the Apple TV works, we will explore the protocols, services, and other systems the device implements, then move on to accessing (rather than “abusing”) the endpoints of those services.

For this article, you will need quite a few things: Wireshark, so we can better understand the protocols; the Go compiler, so we can automate some of these processes; base Linux tools like hd (hex dump), cURL, and avahi-browse; and finally nmap, so we can scan the device and figure out what is running on it. We will use these tools to inspect protocols, understand files, work with requests, and build a program to automate the requests. Further on in the article, we also use a toolkit known as plistutils, which allows us to dissect PLIST files, though it is not used nearly as heavily as the other tools.

A basis for this article

Before we dive deeper into this article and start exploring the deep, dark sides of the internals of an Apple TV, it is important to chomp down some terms!

DAAP → Digital Audio Access Protocol, a protocol introduced by Apple with iTunes to share and stream media such as audio, video, and photos over a network. It belongs to the broader DMAP (Digital Media Access Protocol) family, and the two names are often used interchangeably.
mDNS → Multicast Domain Name System, a protocol often referred to by the names Bonjour or zeroconf, that allows devices on a network to discover each other and communicate without the need for a centralized server.
HD → hd (hex dump), a tool on Linux that prints the contents of a file as hexadecimal bytes.
avahi-utils → A toolkit of programs for working with the mDNS and DNS-SD protocols on Linux.
PLIST/BPLIST → Property List (PLIST) and Binary Property List (BPLIST) are file formats commonly used by Apple to store and exchange data between applications. PLIST is a human-readable, XML-based format that stores data as key-value pairs with various data types. BPLIST is a binary format that serves as a more compact and efficient alternative: it contains the same information as a PLIST but in a “compiled” binary form. Both formats can be decoded and read by applications to retrieve the data they store. Apple tends to use BPLIST more often than regular PLIST because BPLIST files are smaller, quicker to work with, and much lighter to send over a network, so that is the format we will see more frequently. Keep in mind, though, that the format chosen also depends on the situation and not purely on the benefits of each.
ECP → External Control Protocol. ECP is a classification of protocols and implementations that allow systems to work with other systems, both internally and externally. Let us put this in perspective. Say you have two hosts, 10.0.0.19 (H1) and 10.0.0.20 (H2). H1 wants to query something from the internal system of H2 but has neither physical access to the device nor a full remote session on it. To make this possible, H2 exposes ECP so that hosts like H1 can remotely request this data from H2 without creating a session. ECP is typically used by devices so they can feed information back and forth, either for session-based connections or just for general host information, which makes life much easier for third-party applications! ECP is also used in AV systems, home automation, and other control systems, where third-party devices need to communicate with and control different types of equipment! Exciting :D

We will be using these terms frequently within this article, especially when it comes to deeper specifications.
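
To make the PLIST/BPLIST distinction from the term list a little more concrete, here is a minimal Python sketch using the standard library's plistlib module. The dictionary contents are invented purely for illustration and do not come from the Apple TV; the only point is that the same data can be serialized to either format and read back unchanged.

```python
import plistlib

# Hypothetical key/value data, made up for this illustration only.
info = {"deviceName": "Living Room", "port": 7000, "features": [1, 2, 3]}

# Serialize the same dictionary in both formats.
xml_bytes = plistlib.dumps(info, fmt=plistlib.FMT_XML)
bin_bytes = plistlib.dumps(info, fmt=plistlib.FMT_BINARY)

# XML plists begin with an XML declaration; binary plists begin
# with the magic bytes "bplist00".
print(xml_bytes[:5])
print(bin_bytes[:8])

# plistlib.loads() detects the format on its own, and both
# round-trip back to the same data.
print(plistlib.loads(xml_bytes) == plistlib.loads(bin_bytes) == info)

# Compare the encoded sizes of the two representations.
print(len(xml_bytes), len(bin_bytes))
```

When a response body starts with bplist00, that is a quick hint that you are looking at a binary property list rather than XML.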

Note: Before starting, I want to make this clear: the IP address may not stay the same from one request or response to the next. Whichever address appears is simply the one MY OWN Apple TV device had at the time. Please ensure that you have permission before testing anyone else’s device with the frameworks or demonstrations used in this article.

Security Research | Digging into APIs on Apple

This is the primary section of the article, and it is split into individual sub-sections with different notes, tips, and instructions. The article was expected to go fairly deep, so I felt it was best to split it up and give each sub-section something unique. For a general understanding, this article specifically targets the APIs themselves, which includes AirPlay and DAAP/DMAP, while also touching on other systems within the device. If you are wondering, the goal is simply to dig a little into how to do some base research and understand how a specific system (in our case, an Apple TV) works from the outside. Before we can go out there and start trying to exploit the device and its software, or reverse engineer a protocol, we should at least have good knowledge of how the system itself works.

The Apple TV was chosen because Apple has some very complex mechanisms in place to protect its devices, despite also exposing ECP-based APIs and systems. It is also a good way to extend our skills and critical thinking: with Apple, and especially with specific devices like the Apple TV, there is not much public understanding of the device (possibly because Apple may threaten lawsuits if people do not take specific content down, which companies have been known to do for quite some time). Below I have provided the device’s specs so you know what this research worked with.

AppleTV Software Version → 7.9
Source Version → 220.68

It might also be worth noting that for this research and article, I am working on ParrotOS with the tools and proper programs pre-installed and will be running commands directly from the Parrot Terminal.

Section #1 → Exploring the device

Apple is known to make some nice devices with cool, modern interfaces and some of the better security systems around. Because of this, exploring the device externally is quite annoying. What I mean by “externally exploring” is exploring how the device, its calls, its protocols, and its services work without having direct physical access to it. In this article, we will explore every bit of the device from the outside, using base utilities and tools to better understand it. It is important to note that the prime tool in our research will be Google, even if it is a pain to get anything out of it.

Alright, so what is our scenario, and how will we start this exploration? Well, we have an Apple TV and no desire other than to explore it and figure out what is going on. The first thing to think about is what the device is; we can find out simply by using nmap to scan it. Before we run the command, we should have Wireshark running. If you already have Wireshark installed, you can just type wireshark in your terminal and it should pop up. For this tutorial, I will also be filtering traffic (for now) by the source and destination IP address of the device I am looking at. My current window, with the filter active, is shown in the screenshot below.

Now that we have Wireshark open and running, we can go ahead and start to run nmap. There are multiple things we want to do here; we want to discover services like any regular port scan but we also want to view information on these services and be able to view information about the device. The command I formulated for this will be the following:

Dissecting this, we have:

-p: Sets the port range to scan. We do not necessarily need to scan every single port on the device, but I did this for general use cases and examples.
-sS: Tells nmap which type of scan to run; in this case, a TCP SYN scan, which is a stealthier scan method because it never completes the TCP handshake.
-O: Enables operating system detection. For our purposes the OS is Google-able, but it is good practice.
-A: Enables much more aggressive scanning: OS detection, version detection, script scanning, and traceroute, all of which help with service discovery. (Since -A already includes OS detection, passing -O as well is redundant but harmless.)
-v: Produces more verbose output, which can help us spot any errors that happen during the scan.
-T4: Sets the timing template for the scan, which determines how quickly nmap sends packets to the system.
-oA: Writes the scan results in all of nmap's major output formats using the given base filename. We do not strictly need these results, but in my case they worked perfectly.
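
The exact command line is not reproduced in the text, so the following is only a reconstruction from the flags listed above, written as a small Python helper that assembles the argument list. The port range 1-65535 (the text only says every port was scanned), the output base name appletv-scan, and the helper name build_nmap_command are all assumptions made for illustration; 10.0.0.96 is the address used elsewhere in the article.

```python
import shlex


def build_nmap_command(target, ports="1-65535", output_basename="appletv-scan"):
    """Assemble an nmap argument list from the flags described in the article.

    ports and output_basename are hypothetical defaults, not values
    taken from the original command.
    """
    return [
        "nmap",
        "-p", ports,             # port range to scan
        "-sS",                   # TCP SYN scan (needs raw-socket privileges, e.g. root)
        "-O",                    # OS detection
        "-A",                    # aggressive mode: OS/version detection, scripts, traceroute
        "-v",                    # verbose output
        "-T4",                   # timing template
        "-oA", output_basename,  # write all major output formats with this base name
        target,
    ]


command = build_nmap_command("10.0.0.96")
# Print a copy-pasteable shell command; nothing is executed here.
print(shlex.join(command))
```

Building the argument list instead of a single string makes it easy to hand the same list to subprocess.run later without worrying about shell quoting.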

When we run the command and let it finish, we will see a decently sized output shown in the screenshot below!

Note: The scan I ran was much slower and took so much more time due to the number of ports I was scanning, which really can be unnecessary sometimes, especially if you know what to look for.

Looking at this scan, we can see some pretty unique information mostly service related; if we filter out the junk and verbose output, we can see this list here.

Cool, we have discovered some unique ports, and all of them seem to be related to some form of proprietary software that Apple has developed, such as Apple AirPlay HTTPD, AirTunes, Apple iTunes DAAP, etc. Before going further, let us put the data we have into a list, or rather a table.

We have some pretty interesting services here, along with their versions. The main ones we will be targeting and inspecting are Apple iTunes DAAP 11.1b37 and AirTunes RTSPD 220.68, and we will also take a smaller peek at the HTTP server running on port 7100. These are the targets we want to look at because they can help us collect specific information on the device, understand how the protocols work, and reach a halfway decent conclusion that can spawn more curious questions. Before we dig into what these protocols are doing, it might be in our interest to check the logs and see what we got out of these servers, or whether there is anything we can work with. If we do some quick googling mixed with looking at the nmap scan, we can already infer what kind of protocols these servers use, which can be listed as the following:

These are only the base protocols; I am sure there are others that we cannot identify at this point in the research and might find later on. Our Wireshark window should look something like:

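This is A LOT of packets to sift through, so we can narrow things down by protocol. We can now make a filter, which is shown below!

ip.addr == 10.0.0.96 && (http || daap || mdns)

Two details matter here. Wireshark expects IP addresses to be written unquoted, and ip.addr matches a packet whose source or destination is that address. The parentheses make sure the protocol test is always combined with the address test; without them, mDNS or DAAP packets from other hosts would also match. When we apply this filter and let Wireshark re-filter all the packets, we may see more HTTP packets.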
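
A filter that mixes "or" and "and" conditions without parentheses is easy to get wrong, so here is a small Python sketch of the boolean logic only. Python's and/or are used as stand-ins for the filter's && and ||; this models the logic and does not call Wireshark. The function names and the sample packet are hypothetical.

```python
def ungrouped(src_ok, dst_ok, http, daap, mdns):
    # Mirrors a filter of the shape: src || dst && http || daap || mdns
    return src_ok or dst_ok and http or daap or mdns


def grouped(src_ok, dst_ok, http, daap, mdns):
    # Mirrors a filter of the shape: (src || dst) && (http || daap || mdns)
    return (src_ok or dst_ok) and (http or daap or mdns)


# A hypothetical mDNS packet that neither comes from nor goes to our device.
other_host_mdns = dict(src_ok=False, dst_ok=False, http=False, daap=False, mdns=True)

print(ungrouped(**other_host_mdns))  # True: the unrelated packet slips through
print(grouped(**other_host_mdns))    # False: the address test now always applies
```

Explicit parentheses remove any dependence on how a given tool ranks its operators, which is why they are worth typing even when they look redundant.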

Note that right now we are not looking at raw TCP data. For a full dissection it might be relevant, but this article does not revolve around inspecting everything, since we already know what we want to dissect. When we scroll down, we see only a bunch of HTTP requests made by Nmap. We can tell that Nmap made these requests because, when we inspect a packet, we get a little user agent that ends in NSE (Nmap Scripting Engine), which indicates that the request came from a script that Nmap loaded. This example is shown below when inspecting an HTTP GET request with a file path named browseDirectory.jsp.

Woah, we missed something! At the very top of our capture, there are two unique packets: one is an mDNS packet and the other is a response from the DAAP server! Let’s first look at the mDNS packet.

In this data, we see something pretty useful, a multicast domain name to query! We see the following result:

Before we move on to working with the DAAP server, let's see if we can query this with Avahi!

Working with avahi-utils to query mDNS

The services on the Apple TV need to be discoverable and able to communicate with the systems that want to connect to, or create sessions with, the device’s servers! As shown above, we have discovered an mDNS packet sent from the Apple TV to another host, whose value was the following block.

To query this, we can use the Avahi utility avahi-browse, which lets us browse for, locate, and list specific services and gather information about them, including their name, type, protocol, IP address, port number, and other metadata. In our case, avahi-browse will allow us to identify the TV and better understand the queries it is making!

Before we can start querying data, we need to make sure the Avahi daemon is running. We can check this by running:

systemctl status avahi-daemon.service

which will produce an output like:

In some cases, like mine, you may get an error saying that the service failed to start, either because it is already running or because the service is disabled. If the service is disabled, you can use systemctl to enable it; you might also want to ensure that avahi-daemon.socket is enabled. To disable the service and its socket, run the following commands:

Then, we can run the status check again to see whether the service is still failing to load or still running. If you get the same message, try using the pgrep command to see if the process is still running.

If there is no output, then the service most likely is not running, so we can start the service back up.

Once done, we can go back to the Avahi utilities and check whether the daemon is running.

Our socket and service are enabled, which is nice to see; however, it still says it is failing. If, as in my case, the Avahi daemon reports that it failed to load or start while also saying it is running, there can be multiple causes. There may be some miscommunication between the service manager and systemctl, because the daemon is obviously still running. Some other, more critical issue may also be causing this error. If we really want to go deeper, we can use journalctl to check the logs for the service like so:

From here, it will be hard to troubleshoot, but you can check whether Avahi works by running the avahi-browse tool to do some base discovery. For this article, let’s just run a basic scan to see if it works. To do so, we will run the following command:

avahi-browse -avtr

These options work like so:

-a: tells avahi to show all services regardless of the type
-t: tells avahi to terminate after dumping a more or less complete list
-v: tells avahi to enter verbose mode
-r: tells avahi to resolve all the services that were found

When we run this tool, if it is working properly, we should see something like the screenshot below:

The fact that we can run the tool shows that the daemon has been loaded and is working! Now that we know the tool works, we can grab more information from the Apple TV and see if there are any names. When we use Avahi to get this information, we will need to filter the output through grep, because avahi-browse returns information for every service being advertised on the network: the more devices on your network that use mDNS, the larger the output, and vice versa. There are multiple ways we can filter the data. Below is a list of methods:

Query by version: When querying devices like the Apple TV, where we know the versions of the systems and services on the device, we can pipe the output into grep to check for version numbers.
Device name: If you know the name of the device, you can filter on it. This can be a common name like Living Room AppleTV or AppleTV, though some names may have been changed since the device was bought, which means we may need to do extra discovery! Either way, we can still take the output of avahi-browse and pipe it into grep with a filter for the device’s name.

For our situation, since we know it is an Apple TV, we will try the second method and check for labels with “AppleTV” in them. The following command and screenshot show the results.

avahi-browse -avtr | grep AppleTV

It seems that we have many records here, but they all share a similar name! The tags in the TXT data can be dissected as follows.

VV → The version of the mDNS responder protocol; in our case, it is 2.
osvers → The operating system version, which is 8.8.4.
srcversion → The source code version of the device's software, which, as nmap reported, is 220.68.
PI → Persistent identifier, a unique identifier for the device that persists across reboots.
PK → The device's public key.
model → The model identifier of the device, which is AppleTV-3,2.
flags → Flags that provide information about the capabilities of the device (we see this later on). In our case, the value is 0x44, which specifies AirPlay support.
Features → The device's feature flags. In our case, the value 0x5A7FFFF7 indicates support for various media types and the code 0x1E indicates support for the AirPlay protocol.
deviceId → The device's MAC address.
atCV → The AirPlay compatibility version.
DvSv → Device Service Version, the version of the device’s services.
DvTy → Device Type, which in this case is an AppleTV.
CtlN → The control name, which is the name of the device (in our case) that appears in the AirPlay menu.
DbId → The Database ID, a unique identifier for the device's database (keep this info for later).
atSV → AirPlay Service Version, the version of the AirPlay service running on the device.
txtvers → The version of the TXT record format being used; in our case, the value is the first version (1).
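
Reading these tags off a screenshot gets tedious, so here is a minimal Python sketch that turns one txt = [...] line, in the shape avahi-browse prints when run with -r, into a dictionary. The sample line is made up for illustration: it reuses a few values quoted above, and the exact key spelling and casing on a real device may differ. The helper name parse_txt_line is hypothetical.

```python
import re

# Matches a resolved TXT line such as:   txt = ["key=value" "key2=value2"]
TXT_LINE = re.compile(r'^\s*txt = \[(.*)\]\s*$')
# Matches each double-quoted entry, allowing backslash-escaped characters.
QUOTED = re.compile(r'"((?:[^"\\]|\\.)*)"')


def parse_txt_line(line):
    """Return the key/value pairs of one avahi-browse 'txt = [...]' line."""
    match = TXT_LINE.match(line)
    if not match:
        return {}
    record = {}
    for entry in QUOTED.findall(match.group(1)):
        key, sep, value = entry.partition("=")
        # Entries without '=' are boolean-style flags; store them as None.
        record[key] = value if sep else None
    return record


# Illustrative sample only, shaped like avahi-browse -r output.
SAMPLE = '   txt = ["osvers=8.8.4" "flags=0x44" "features=0x5A7FFFF7,0x1E"]'

record = parse_txt_line(SAMPLE)
print(record)
# Hex-looking values can then be converted for bit tests.
print(int(record["flags"], 16))  # 68
```

Once the records are in a dictionary, comparing what mDNS advertises against what nmap reported becomes a simple lookup instead of squinting at terminal output.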

Wow, we got quite a load of information about the device just from checking its records! You may be asking why this is important. Querying for this kind of data matters for two reasons.

The process of developing an exploit or further research: When building an exploit, or even further investigating a device, it is important to know the version numbers of the services running and to verify whether what one specific tool reported is right. Sure, nmap is a very reliable tool, but like any program it can produce faulty results if it is not run correctly, so being able to double-check this data is important! In the case of further investigation, it can help us find specific design documents or even Google dork for more related files!
Understanding how the device works further: Not only does this information help when searching online for specific details, it also helps us work with the device itself.

We got some decent information out of this, but it is worth inspecting some other packets too, like the next frame further down in the capture, shown here.

When inspecting the highlighted packet, we get many more queries, shown below.

As you can see from the image above, there is much more information and many more records here. Now, what if we really wanted to go through every record and every possible mDNS name? We could, and it might actually be helpful if they are reachable! A real upside of looking at mDNS is that we can stumble upon open API endpoints on specific services and ports, and extract information from them without having to craft or send special data to get it. Recon on this side can really benefit the research; however, it may be quite time-consuming!

Understanding the DAAP protocol

DAAP (Digital Audio Access Protocol) is part of the DMAP (Digital Media Access Protocol) family, alongside the related DACP (Digital Audio Control Protocol), and the names are often used interchangeably. It was released by Apple with the iTunes service and allows media such as audio and video to be shared over the network. As the capture shows, DAAP requests and responses travel over HTTP, while the AirTunes service found in the scan is the one that speaks RTSP (Real Time Streaming Protocol). This protocol, like most of the other protocols on this list, can be accessed externally due to the implementation of ECP. Because of this, there are specific endpoints and packets we can inspect to get more information. If we look back at the services that were mapped out, we also recall seeing that port 3689 was open:

We also recall seeing a packet that was the DAAP server response, shown in the image below.

When looking at this data, we notice a few things, so let us dissect the HTTP data. The response from the server is an HTTP 403 Forbidden error. Okay, cool, that is already something interesting to work with. Looking further into the packet, we can also see the DAAP server version and the type of content that was given as a response. Some base information is shown, and then we have the DAAP protocol tag, which is shown as unknown with a size of 24. This is some decent information, but what happens when we inspect the request?

When we inspect the request, we see the following:

This actually helps a lot! We now have the request URI. We could visit this in our browser, but we will be using curl with the verbose option on to see what is happening. When we run the command

we get the following output.

This output is telling us that the response was not regular HTML, JSON, plain text, XML, PLIST, etc. As we saw in Wireshark, the body is not text but a tag-based binary format: the server is sending x-dmap-tagged responses (which we can see in the screenshot above). So, we change the command so that curl writes the response to an output file:

When we run this command, we get the same output as before, minus the warnings. When we open the output file with hd, we get the following output:

This is actually some good information to study. Cool, so we have a hex dump of a response we know nothing about. We could Google this; however, that is quite hard in our case because Apple is a big boy that does not like to be talked about, and most resources about the protocol itself have pretty much been taken down, which means we have to figure this out ourselves.

So, let's start mapping out the values. With some base knowledge, we can tell that the DAAP server had an error processing the request because of MERR, where ERR typically stands for error and M typically means message, which in our case would make it an error message. We can also infer, based on how most tag-based server responses work, that MSTT is a status tag, meaning the status code of the server was returned. That leaves only one thing to infer: MERS most likely means media error. Now, we could simply assume this and move on with our day, or we could try to replicate this response on the DAAP server, keep working with the server, and figure out whether these codes are universal to DAAP. But how exactly will we replicate this? Well, we can do some quick Google dorking to see if we can find anything. So, let us run the following search query:

unofficial documentation for the AirPlay DAAP protocol inurl iTunes

and wow, we actually get a result for some documentation shown below:

When we click the link, we are taken to this page:

This page is actually pretty cool because it belongs to an open-source tool and client library for Apple TV. This is a gold mine for us because it helps us further in our research! Scrolling down, we see the following information:

The information shown here is everything we need! Now all we have to do is some manual parsing to figure out which tag holds which value. Each tag holds its own value, which is thought-provoking: it is a pretty interesting way to respond with codes and messages, and it makes the response hard to decode without the right information. We can also test our theory and see whether these codes are specific to this server, because the page provides URLs and paths too! Looking further down, we are also led to some more unofficial documentation, written by someone who was able to properly explain the concepts and dig into the endpoints and how they work. Taking a look at that documentation, we do not find much we can use right away, as most of it requires specialized requests, such as PLIST files or other files that need work, so we have to go a bit older and rely on some other documentation that also came up in the search. The one that caught my interest was the domain daap.sourceforge.net, which gives some cool base information.

When we dig deeper, we end up finding some decent endpoints that we can work with to test our theory.

Now there are a few things we can do: we can write a program to automate these requests, or go through them individually. Either way, it is tedious to copy and paste each number and each possible parameter, so for now all we are going to do is take the database URL and try it with a 1.

http://10.0.0.96:3689/databases/1/containers/1/items
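
If you would rather not hand-edit URLs like the one above, a minimal Python sketch along these lines could assemble them and fetch the raw response. The function names daap_items_url and fetch are hypothetical, the default port 3689 is the DAAP port seen in the scan, and no session parameters are sent, so a real device is free to refuse the request.

```python
import urllib.error
import urllib.request


def daap_items_url(host, database_id, container_id, port=3689):
    """Build a /databases/<id>/containers/<id>/items URL for a DAAP server."""
    return (
        f"http://{host}:{port}"
        f"/databases/{database_id}/containers/{container_id}/items"
    )


def fetch(url, timeout=5.0):
    """Return (status, content_type, body) without raising on HTTP error codes.

    Network failures such as timeouts still raise urllib.error.URLError.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.headers.get("Content-Type"), resp.read()
    except urllib.error.HTTPError as err:
        # Error responses still carry headers and a body worth inspecting.
        return err.code, err.headers.get("Content-Type"), err.read()


url = daap_items_url("10.0.0.96", 1, 1)
print(url)

# Example usage against a device you are allowed to test:
#   status, content_type, body = fetch(url)
#   open("items.bin", "wb").write(body)
```

Catching HTTPError and returning the body matters here because the interesting bytes arrive with the error response itself.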

I expect this to time out or spit out errors, the reason being that we can’t fully target the AirPlay services if there is no session currently running, so you may want to try a different approach where a session is running. It is also important to note that Apple requires a whole set of parameters before these endpoints can be accessed, mostly remote IDs, session keys, tags, and various other sets of data. When we run a normal HTTP GET request, we get the following data and output file:

As expected, we got the same codes! But what are the codes? Well, the following table describes them, based on the first website we came across.

Cool, now we have a decent understanding of what these codes are and what they do. If you noticed, one of the descriptions points to a new section, which will be our secondary section here. It covers how we get the values and better dissect everything.

Section 1.5 (Dissecting and decoding the bytes)

In the table above, in the second row, which talked about the MSTT code from the DAAP server, I mentioned that this was an HTTP 403 forbidden error. How do we know this? Well, it is all in the conversion of the bytes and the order of the bytes you place them in. When we look at the hex dump again it looks like the following image.

After the first eight bytes of MSTT

0x6d, 0x73, 0x74, 0x74, 0x00, 0x00, 0x00, 0x04

We can take the next four bytes and determine the status code; the next four bytes are the following:

0x00, 0x00, 0x01, 0x93

We can convert this in a multitude of ways, but first, let us walk through it. The first eight bytes are the four-character tag (mstt) followed by a four-byte length, which here is 4, so the value is the four bytes that follow: 0x00, 0x00, 0x01, and 0x93. Since the bytes are already in network byte order (big-endian), we can join them into the single hexadecimal number 0x00000193 and convert that to decimal with any hexadecimal converter: 1 × 256 + 9 × 16 + 3 = 403. In conclusion, the four-byte sequence that follows the MSTT tag and its length is the HTTP 403 Forbidden status code.
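
The same conversion can be sketched in a few lines of Python. The parser below walks a buffer of tag/length/value entries (four-byte ASCII tag, four-byte big-endian length, then the value) and is fed the twelve MSTT bytes quoted above. The function name parse_dmap is hypothetical, and treating merr as a container that wraps other entries is an assumption made for the second example, not something confirmed here.

```python
import struct


def parse_dmap(data, containers=frozenset()):
    """Split a DMAP-style buffer into (tag, value) pairs.

    Tags listed in `containers` have their value parsed recursively.
    """
    items = []
    offset = 0
    while offset + 8 <= len(data):
        tag = data[offset:offset + 4].decode("ascii")
        (length,) = struct.unpack(">I", data[offset + 4:offset + 8])
        value = data[offset + 8:offset + 8 + length]
        if tag in containers:
            value = parse_dmap(value, containers)
        items.append((tag, value))
        offset += 8 + length
    return items


# The twelve bytes quoted in the article: tag, length, value.
sample = bytes([0x6D, 0x73, 0x74, 0x74, 0x00, 0x00, 0x00, 0x04,
                0x00, 0x00, 0x01, 0x93])

[(tag, value)] = parse_dmap(sample)
status = int.from_bytes(value, "big")  # network byte order is big-endian
print(tag, status)  # mstt 403

# Assumption for illustration: merr acts as a container around mstt.
wrapped = b"merr" + struct.pack(">I", len(sample)) + sample
print(parse_dmap(wrapped, containers={"merr"}))
```

Reading the length field instead of hard-coding four bytes means the same loop also copes with string-valued tags of any size.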

Remember how in the previous section I mentioned the device’s database ID that was returned in the mDNS response? Well, this is where it comes into play. Each device that hosts DAAP has a unique database ID assigned to it, and this database ID can be used to access the endpoints that contain data within the application. There is a reason each database is given a unique ID. If you have two TVs, for example, on the same network, and a device that needs to access them externally, those unique database IDs allow the external device to reach the proper database on the proper host. If the database IDs were not unique, the two devices could get confused when querying data from each other, and so could the external host. In this case, the external host is a device running a third-party application, while the two devices are two separate Apple TVs.

Conclusion and Summary for Part 1

I know this was quite a weird place to end, but the article got so big that I really could not do it in one part, out of respect for specific needs and, of course, readers' time! So, let’s first go over what we know so far. We started to build a good base understanding of what security research is, the fundamentals of how it should be conducted, and how to approach systems and research. We then learned how to use specific tool sets to expand our knowledge, and even did a few Google tricks here and there to find responses or proper documentation. To conclude this part, it is important to know that security research is one of the more in-depth things a cyber security expert can encounter. When researching devices, protocols, systems, etc., it is important that we do as much research as possible on the system. The cyber security world often seems so focused on exploiting the system that we forget the most important part of exploitation: researching the entire system, dissecting backend protocols, digging into APIs, and doing external research, which not only trains your logical processing but also helps you better understand the system you are exploiting! I hope this article aided you in some shape or form, and if it did, be sure to check out part 2!
