TreeON: Reconstructing 3D Tree Point Clouds from Orthophotos and Heightmaps

Angeliki Grammatikaki¹, Johannes Eschner¹, Pedro Hermosilla¹, Oscar Argudo², Manuela Waldner¹
¹ TU Wien
² Universitat Politècnica de Catalunya
Resources:
- Paper: publication, supplementary material (PDF)
- Code: TreeON model code; pretrained model weights (Google Drive); data generation (synthetic trees); scene generation (rendering code)
- Data: synthetic tree dataset (GeoTree3D)
[Teaser image showing TreeON results]

Abstract

We present TreeON, a novel neural-based framework for reconstructing detailed 3D tree point clouds from sparse top-down geodata, using only a single orthophoto and its corresponding Digital Surface Model (DSM). Our method introduces a new training supervision strategy that combines geometric supervision with a differentiable shadow and silhouette loss to learn point cloud representations of trees without requiring species labels, procedural rules, detailed terrestrial reconstruction data, or ground laser scan data. To address the lack of ground truth data, we generate a synthetic dataset of point clouds from procedurally modeled trees and train our network on it. Quantitative and qualitative experiments demonstrate better reconstruction quality and partially superior coverage compared to existing methods, as well as strong generalization to real-world data.
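The paper's differentiable shadow and silhouette loss is not spelled out on this page. As a rough illustration of the idea only, the following numpy sketch (all function names, the splatting scheme, and parameters such as `sigma` are hypothetical, not taken from the paper) renders a point cloud into a soft top-down silhouette and scores it against a binary target mask with binary cross-entropy; because each point contributes a smooth Gaussian footprint, the silhouette varies smoothly with point positions, which is what makes such a loss usable for gradient-based training.

```python
import numpy as np

def soft_silhouette(points, grid=32, sigma=1.5):
    """Splat a point cloud (N, 3) onto a top-down occupancy grid.

    Each point contributes a Gaussian footprint, so the resulting soft
    silhouette varies smoothly with the point positions. Points are
    assumed normalized to [0, 1] in x/y (hypothetical convention).
    """
    ys, xs = np.meshgrid(np.arange(grid), np.arange(grid), indexing="ij")
    img = np.zeros((grid, grid))
    for x, y, _z in points:
        d2 = (xs - x * grid) ** 2 + (ys - y * grid) ** 2
        # max-style accumulation keeps occupancy values in [0, 1]
        img = np.maximum(img, np.exp(-d2 / (2.0 * sigma**2)))
    return img

def bce_loss(pred, target, eps=1e-7):
    """Binary cross-entropy between a soft silhouette and a binary mask."""
    pred = np.clip(pred, eps, 1.0 - eps)
    return float(-np.mean(target * np.log(pred)
                          + (1.0 - target) * np.log(1.0 - pred)))
```

A point cloud whose footprint overlaps the target mask yields a lower loss than one splatted elsewhere; an analogous rendering of cast shadows would give the shadow term.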

Overview of the network architecture and training supervision
Figure 1: Overview of the network architecture and training supervision (green).

Qualitative Results

Qualitative Results on Landmark Trees

[Image grid: five typical reconstructions (left block) and five difficult cases (right block); each numbered row shows DSM, orthophoto, target photograph, and model output.]
Table 1: Qualitative comparison of reconstructed landmark trees against target photographs of the real trees.
Left block: typical reconstructions.
Right block: difficult cases: (1) the DSM underestimates tree size, (2) the orthophoto is cluttered by neighboring trees, (3) shadows are weak or missing, and (4) a dead tree appears in the orthophoto. In each of these cases, the model compensates by leveraging the complementary input signal. (5) Failure case: the DSM shape and weak orthophoto cues mislead the model into reconstructing a conifer instead of the deciduous target.

Ablation Study

[Image grid: left, the DSM and orthophoto inputs with the target photo; right, rows show reconstructions from DSM only, orthophoto only, and DSM + orthophoto, while columns compare loss functions: BCE, Shadow, Silhouettes, Shadow+Silh, BCE+Shadow, BCE+Silh, and BCE+Shadow+Silh.]
Table 2: Qualitative ablation study of different loss functions (columns) and input modalities (rows). Left: DSM and orthophoto inputs with target photo. Right: Rows show reconstructions from DSM, orthophoto, or both inputs, while columns compare supervision strategies.
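The seven supervision columns in Table 2 correspond to toggling individual loss terms on and off. As a hypothetical sketch only (the additive form and the weights `w_shadow`/`w_silh` are assumptions, not taken from the paper), the combinations can be expressed as:

```python
def combined_loss(l_bce, l_shadow, l_silh,
                  use_bce=True, use_shadow=True, use_silh=True,
                  w_shadow=1.0, w_silh=1.0):
    """Additively combine the supervision terms from the ablation study.

    Setting the use_* flags reproduces the seven column configurations,
    e.g. BCE only, Shadow+Silh, or the full BCE+Shadow+Silh objective.
    """
    total = 0.0
    if use_bce:
        total += l_bce
    if use_shadow:
        total += w_shadow * l_shadow
    if use_silh:
        total += w_silh * l_silh
    return total
```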

Comparison to State-of-the-Art Models

[Image grid: columns show the inputs (DSM, orthophoto), the ground truth, and reconstructions by OpenLRM [1], TMNet [2], AtlasNet [3], DPMs [4], PUGeoNet [5], RepKPU [6], and our method.]
Table 3: Visual comparison of our approach against baseline models.

Comparison with high-resolution LiDAR data

[Image grid: two examples (rows), each showing DSM, orthophoto, high-resolution LiDAR point cloud, and our reconstruction.]
Table 4: Qualitative comparison of our reconstructions with IGN LiDAR and orthophoto data from the French Pyrenees. The aerial LiDAR point cloud density is 37 and 25 pts/m² for the two examples, respectively.