ADS-B flight trajectory capture attempts

title_zh: ADS-B航迹采集尝试

(English version below)

航班追踪是很好的公民科学项目,刚好也和我的研究专长兴趣非常相符。在本科毕设指导评审和自己研究立项申请中,我都很希望可以延续以往车辆和行人轨迹数据分析的研究,同时与所在单位的教学科研方向实质性地紧密结合。除去理论和方法研究层面的问题,最常遇到的问题要属研究区域选择了:公开数据集通常不能覆盖中国大陆地区——相比于维基和OSM大陆社群,火腿和飞友们的规则意识和社会责任感使得他们自觉不向海外传输和共享数据。研究海外区域的数据则会使得项目申请和学生答辩面临沉重致命的质疑。国内航班信息提供商缺乏方便直接的数据接口,而缺乏网页产品、主打移动应用的策略使得逆向工程难度倍增,更不用说潜在的伦理和法律风险。因而,自行采集数据成为了一条值得尝试的出路。

受到医学项目预实验范式的启发,过去一个月写项目申请期间我决定通过动手采集和观察数据寻找灵感。此前了解到用于收看电视广播的RTL2832U电路和R820T芯片是一种价格和性能都可接受又易于获取的软件无线电解决方案。而各大航班追踪网站都会同时建议搭配一个SBC以最小代价实时接收和共享数据,因而还购入了一款树莓派作为初始设备。当然,后期走通技术路线并对整套系统有基本了解之后,在笔记本上安装虚拟机的方式在不要求持续在线的场景中也完全可行且方便。

按照adsb.im网站提供的说明,我们分别在RPi和VirtualBox中烧录了带有ultrafeeder项目安装的Raspberry OS和DietPi系统镜像。这个项目由Github的sdr-enthusiasts组织维护,使用了德国爱好者wiedehopf改编的后端ADS-B解码工具readsb和前端ADS-B航迹可视化界面tar1090。这个feeder项目支持向诸多平台实时共享数据,同时支持不作任何共享以及在一个feeder实例下接入多个子feeder (称为Stage 2),满足私有数据保护和室内接收房间窗户单一面向性条件下多个天线数据的简单合并的需要。经过反复调整,在自家阳台上采用树莓派作为主机,放在另一侧窗口的旧笔记本上运行的虚拟机作为子节点,以及日常携带的工作笔记本上运行虚拟机作为在校测试的独立移动节点。截止目前,在自家两侧稳定采集将近两周,在广汉校区和天府校区测试采集各一次(共62061条记录),具体统计分析还有待进行。

colored dots plotted over a greyscale terrain map
飞院校园测试采集到的航迹 (颜色表示速度,红色慢、蓝色快)
Flight trajectories collected from tests on the campuses of Flight University (color-coded speed, red for slow, blue for fast)

粗略观察可见,广汉的训练航迹因为三教的遮挡有部分缺失,但双流北向离场倒是意料之外地好;福田能通过不同方向拼凑看到天府西跑道北向和北跑道东向离场、东西两条跑道北向进近略有缺失,机坪南端的滑行也采集到一些;高空部分在五凤溪处四通八达。

在实际调试过程中当然少不了各种问题。比如电视棒硬件实际上质量参差不齐,表现为完全收不到任何有效信号只有噪声,或者难以在指定频率持续稳定工作,似懂非懂地查阅一些资料后认为可能是tuner芯片失效,由于没有相关电路知识和工具,所以也没有拆机深究,而是只能退货重买碰运气。软件上问题虽然不多,但也值得注意。网络环境导致开源工具链中的Debian软件更新、Docker软件获取、Github软件获取较为痛苦,尤其是初始搭建点亮的过程,配置更换镜像源自是不在话下,但除了各种折腾之外采用一些非可靠可信或有登录限制的私人源引入了一些对本任务没有那么重要的供应链安全风险。另外目的不同也使得我们虽然不需要配置共享到平台的内容,但需要更改一条readsb的配置选项READSB_ENABLE_TRACES=true,可以在ultrafeeder网页管理界面右上角的Setup菜单Expert选项(http://rpi.local/expert)通过添加环境变量的方式处理。配置好后以gunzip压缩的json格式轨迹将按照ICAO24位地址16进制表示的末两位组织在默认/opt/adsb/config/ultrafeeder/globe_history目录下的traces文件夹。如果没有配置开启轨迹存盘,则该路径中只会保留用于简单回放和查看大致分布的heatmap文件夹内的ttf格式二进制存档,没有包含完整的信息1,同时文件解析和格式也明确没有保障2

总之有了这些数据,以后研究和指导希望都会有更好的素材和选题。从这些数据样本出发先行开发测试算法,也为后期凭借可展示的初步结果寻求更权威完整的合作数据申请提供支撑。

English version

(Partially translated by gemma2 via Ollama, then adjusted and reviewed by myself with reference to deepseek-r1:14b via Ollama)

Flight tracking makes a great citizen science project, and it also matches my research expertise and interests. In both undergraduate thesis supervision and my own research proposals, I have long hoped to continue my earlier work on analyzing vehicle and pedestrian trajectory data while aligning it substantively with the teaching and research directions of my institution.

Besides theoretical and methodological issues, the most common challenge has been choosing the study area: publicly available datasets usually do not cover mainland China. Unlike the Wiki and OSM communities here, local hams and aviation enthusiasts have a strong sense of rules and social responsibility, so they deliberately refrain from transmitting and sharing data abroad. Studying overseas regions instead would expose funding applications and student thesis defenses to serious, potentially fatal, objections. Domestic flight information providers lack convenient, direct data interfaces, and their mobile-first strategy with no web product makes reverse engineering far more difficult, not to mention the potential ethical and legal risks. Collecting the data myself therefore became an option worth trying.

Inspired by the pilot study paradigm of medical research, I decided to look for inspiration by collecting and observing data hands-on while writing my proposal over the past month. I had previously learned that the RTL2832U circuit and R820T chip, originally made for watching television broadcasts, form an affordable, adequately performing, and readily available software-defined radio (SDR) solution. The major flight tracking websites also recommend pairing this hardware with a single-board computer (SBC) to receive and share data in real time at minimal cost, so I bought a Raspberry Pi as my starter device. Once the technical route was working and I had a basic understanding of the whole system, running a virtual machine on a laptop turned out to be entirely feasible and convenient for scenarios that do not require a continuously online feed.

Following the instructions provided on adsb.im, we flashed Raspberry Pi OS and DietPi system images bundled with the ultrafeeder project onto the RPi and into VirtualBox, respectively. This feeder project, maintained by the sdr-enthusiasts organization on GitHub, integrates the backend ADS-B decoder readsb and the frontend ADS-B track visualization interface tar1090, both adapted by the German enthusiast wiedehopf. The feeder supports real-time data sharing to many platforms, and it also supports no sharing at all as well as attaching multiple micro-feeders under a single feeder instance (a so-called Stage 2 configuration). This meets my needs for keeping data private and for simply merging data from multiple antennas in an indoor setup where each window faces only one direction.

After repeated adjustments, I set up a network with the Raspberry Pi on my balcony as the main device, a virtual machine on an old laptop by the window on the other side as a micro-feeder node, and another virtual machine on my everyday work laptop as a standalone mobile node for testing on campus. So far, there have been nearly two weeks of stable collection from both sides of my home, plus one test collection each at the Guanghan and Tianfu campuses, which together yielded 62061 records. Detailed statistics and analysis are still pending.

Initial observations (figure above) show that training flights at Guanghan (GHN/ZUGH) were partially missing due to obstruction by Teaching Building 3, but northbound departures from Shuangliu (CTU/ZUUU) were received surprisingly well. At Futian, combining views from several directions covered northbound departures from the west runway and eastbound departures from the north runway at Tianfu (TFU/ZUTF), while northbound approaches to both the east and west runways had some gaps. Some taxiing near the southern end of the apron was also captured. At high altitudes, traffic fanned out in all directions over the Wufengxi VOR/DME (WFX).

Not surprisingly, various issues arose during the process. For example, the quality of TV dongle hardware is inconsistent: some units received no valid signal at all, only noise, and others had difficulty working stably at the designated frequency. After some half-understood reading, I suspected a faulty tuner chip. However, lacking the relevant circuit knowledge and tools, I did not open the units up to investigate further and could only return them and buy again, hoping for better luck. Software issues were fewer but still worth noting. The network environment made Debian package updates and downloads from Docker and GitHub in the open-source toolchain quite painful, especially during the initial bring-up. Switching to mirror sources goes without saying, but apart from all the tinkering, resorting to some untrusted or login-restricted private mirrors introduced supply chain security risks, although these matter less for this particular task.

Additionally, because our goal differs from that of a typical feeder, we did not need to configure any sharing to platforms, but we did have to enable the trace logging feature of readsb manually. This means adding the environment variable READSB_ENABLE_TRACES=true, which can be done in Expert mode under the Setup menu in the upper right corner of the ultrafeeder web interface (accessible at http://rpi.local/expert). Once configured, flight trajectory data in gzip-compressed JSON format is written to the traces folder under /opt/adsb/config/ultrafeeder/globe_history by default, organized by the last two hexadecimal digits of the ICAO 24-bit address. If trace logging is not enabled, this path only keeps the heatmap folder, whose binary .ttf archive files are meant for simple playback and for viewing the rough distribution and do not contain the full information (wiedehopf/readsb#23). Their format and parsing also explicitly come with no guarantees (wiedehopf/tar1090#221).
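As a minimal sketch of how such trace files might be read back for analysis, the Python snippet below walks a traces folder, decompresses each file, and flattens it into point tuples. The file name pattern trace_full_*.json and the field layout (an "icao" string, a base "timestamp", and a "trace" list whose entries start with seconds offset, latitude, longitude, altitude, and ground speed) are my assumptions about readsb's output and should be checked against the actual files; the function names are purely illustrative.

```python
import gzip
import json
from pathlib import Path


def trace_bucket(icao_hex: str) -> str:
    """Sub-folder name: the last two hex digits of the ICAO 24-bit address."""
    return icao_hex.lower()[-2:]


def load_trace(path: Path) -> dict:
    """Read one trace file, whether gzip-compressed or plain JSON."""
    raw = path.read_bytes()
    if raw[:2] == b"\x1f\x8b":  # gzip magic number
        raw = gzip.decompress(raw)
    return json.loads(raw)


def iter_points(trace: dict):
    """Yield (icao, unix_time, lat, lon, altitude, ground_speed) tuples.

    Assumed layout (not confirmed by this post): trace["timestamp"] is a
    base Unix time and each entry of trace["trace"] begins with
    [seconds_offset, lat, lon, altitude, ground_speed, ...].
    """
    base = trace["timestamp"]
    for p in trace["trace"]:
        yield (trace["icao"], base + p[0], p[1], p[2], p[3], p[4])


def collect(root: Path) -> list:
    """Gather points from every trace_full_*.json found below root."""
    points = []
    for path in sorted(root.rglob("trace_full_*.json")):
        points.extend(iter_points(load_trace(path)))
    return points
```

On the feeder itself this could be called as collect(Path("/opt/adsb/config/ultrafeeder/globe_history")). Searching recursively avoids depending on exactly how the traces folder is nested, and checking the gzip magic number lets the same loader handle files that have already been decompressed.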

In conclusion, these data should provide better material and topics for future research and supervision. Developing and testing algorithms on these samples first will also yield presentable preliminary results to support later requests for more authoritative and complete data through collaboration.

  1. Reply by the author himself in wiedehopf/readsb#23: “pTracks (and the 45 min history on non globe-index installs) uses the data produced by tar1090.sh in that repo. It basically builds its own history using aircraft.json and archiving a reduced data set. This is the origin of my tar1090 webinterface. At some point i’d like to move that generation of data to readsb but it’s not been a priority as it works perfectly fine.” ↩︎
  2. Reply by the author himself in wiedehopf/tar1090#221: “They are binary, i might change the format without notice, if you want your own archives do that. Or parse the json traces that can be put out, they have much more info than the base data for replay.” ↩︎