In an earlier post I went over how I was using LiDAR to calculate tree height in the Pacific Northwest. Since that post there have been a lot of new developments on this project. The first thing to note is that, though there was some trepidation over the sheer quantity of data, it turned out to be a needless worry. As mentioned in the earlier post, one tile was processed rather quickly. Doing the 100 tiles that cover the whole study area (the Hood Canal watershed and on up to Port Townsend, for those familiar with the area) could have been a big processing nightmare. Thankfully, the Puget Sound Consortium LiDAR data that I was using was already in DEM format as opposed to raw point data. That was key, since converting raw point data to a surface grid is, from what I hear, a very processing-intensive task.*
So basically the task was to subtract the ground surface from the top surface. The ground surface was already mosaicked into a single dataset while the top surface was in tiles. I went ahead and mosaicked the top surface tiles, which did take some time but was not completely onerous. It might’ve been somewhere between 4 and 8 hours from starting the mosaic to finishing, though I did not set a timer to record the exact amount and was doing other things in between batches. With that done, the subtraction calculation did not take too much time either.
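For anyone who wants to see the subtraction spelled out, here is a minimal sketch in Python with NumPy. It is not the tool I actually ran; it assumes the two surfaces have already been read into aligned arrays of the same shape, and the NoData value of -9999 and the function name canopy_height are made up for illustration.

```python
import numpy as np

NODATA = -9999.0  # assumed NoData marker; check what your rasters actually use


def canopy_height(top_surface, ground_surface, nodata=NODATA):
    """Subtract the ground surface from the top surface, cell by cell."""
    top = np.asarray(top_surface, dtype=float)
    ground = np.asarray(ground_surface, dtype=float)
    if top.shape != ground.shape:
        raise ValueError("the two grids must be aligned and the same shape")

    # Only compute a height where both surfaces have real values.
    valid = (top != nodata) & (ground != nodata)
    height = np.full(top.shape, nodata, dtype=float)
    height[valid] = top[valid] - ground[valid]

    # Small negative differences are treated as noise and set to zero
    # (an assumption made for this sketch).
    height[valid & (height < 0)] = 0.0
    return height


# Tiny made-up example: elevations in feet.
top = np.array([[150.0, 210.0], [NODATA, 99.5]])
ground = np.array([[100.0, 100.0], [100.0, 100.0]])
heights = canopy_height(top, ground)
```

On a real mosaic the arrays would come from whatever raster reader you use, and the result would be written back out with the same georeferencing as the inputs.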
Now for the more interesting stuff: the results. Here’s an image showing the Lower Dosewallips River. The blue line is the current hydrology layer (you can see it isn’t perfect, of course), and around it is a buffer extending 200′ on either side. The tree height results are transparent so that you can see through them to a high-resolution NAIP 2006 image underneath. Dark green marks trees 100′ – 200′ tall and light green marks trees 30′ – 100′ tall.
A problem with the data is visible in this picture, and it was quite apparent to those who know this area: the analysis did not pick up any of the deciduous trees. Since this area is dominated by coniferous forest – which shows up remarkably well when compared to the image – this wouldn’t be a problem except for the fact that these river corridors are one of the major analysis sub-units. And in the river corridors, patches of deciduous trees, such as alder, can be more prominent and are important to measure in this project. The reason for this problem is simple: all of the LiDAR was flown during the winter, i.e., leaf-off, thereby making the deciduous trees more or less disappear.
The great thing is, there is some LiDAR data available for a few of the study area stream corridors that was commissioned separately and taken during the summer, when the deciduous trees should show up. I just got that data a few days ago, processed it for height and used the same symbology as for the other data to show you a comparison in the same area as the last screen shot:
That data is also higher-resolution than the original LiDAR I was working with (3 feet versus 6 feet). You can see that this summer LiDAR was much better at picking up all the trees. There are a few patches of green on the imagery that don’t seem to show in the LiDAR height analysis, but perhaps those trees aren’t 30′ tall yet, since only heights of 30′ and above are shown as green in this classification.
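The height classes behind the symbology in both screenshots can be sketched like this, again in Python with NumPy. The breaks at 30′, 100′ and 200′ are the ones described above; which class an exact boundary value falls into, and the extra class for anything over 200′, are assumptions I have added so the sketch is complete.

```python
import numpy as np

# Class breaks in feet, taken from the symbology described in the post.
CLASS_BREAKS_FT = [30.0, 100.0, 200.0]

# Class 3 is an assumption; the post only describes the first three.
CLASS_LABELS = {
    0: "not drawn (under 30 ft)",
    1: "light green (30 to 100 ft)",
    2: "dark green (100 to 200 ft)",
    3: "over 200 ft (not described in the post)",
}


def classify_heights(height_ft):
    """Return a class code for each height value.

    np.digitize puts a value equal to a break into the higher class,
    so exactly 100 ft lands in class 2.
    """
    return np.digitize(np.asarray(height_ft, dtype=float), CLASS_BREAKS_FT)


sample = classify_heights([10.0, 30.0, 99.9, 100.0, 150.0, 250.0])
```

A tree that is 28′ tall ends up in class 0 and is simply not drawn, which is one possible explanation for green patches on the imagery that have no color in the height layer.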
Unfortunately the leaf-on data isn’t available for the entirety of the study area. This brings us to one of the age-old GIS analyst’s questions: do we cobble together better data with worse data, even though it means that the error is highly spatially variable, or do we stick with using the worse data for the whole study area, since it’ll enable an apples-to-apples comparison across the landscape? The answer is: it depends. It depends on how the data will ultimately be used. And right now, we don’t know the exact questions we are trying to answer, though we do know that some comparisons from basin to basin will be made. So for now this question, for this particular project, can be left unanswered.
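If the cobbled-together option were chosen, one way to keep the spatially variable error visible is to carry a second grid recording which source each cell came from. The sketch below is only an illustration of that idea, not something I have done for this project. It assumes both height grids have already been resampled to the same cell size and extent, and that cells without leaf-on coverage are marked with NaN; the function name merge_with_provenance is hypothetical.

```python
import numpy as np


def merge_with_provenance(leaf_off_height, leaf_on_height):
    """Prefer leaf-on heights where they exist, else fall back to leaf-off.

    Returns the merged heights plus a source grid:
    1 where the value came from leaf-on data, 0 where it came from leaf-off.
    """
    leaf_off = np.asarray(leaf_off_height, dtype=float)
    leaf_on = np.asarray(leaf_on_height, dtype=float)
    if leaf_off.shape != leaf_on.shape:
        raise ValueError("grids must share the same shape and alignment")

    has_leaf_on = ~np.isnan(leaf_on)  # NaN marks "no leaf-on coverage" here
    merged = np.where(has_leaf_on, leaf_on, leaf_off)
    source = np.where(has_leaf_on, 1, 0)
    return merged, source


# Made-up heights in feet; the right-hand column has leaf-on coverage.
leaf_off = np.array([[40.0, 0.0], [120.0, 35.0]])
leaf_on = np.array([[np.nan, 55.0], [np.nan, 60.0]])
merged, source = merge_with_provenance(leaf_off, leaf_on)
```

With the source grid in hand, a basin-to-basin comparison could at least report how much of each basin rests on the better data.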
*For those in the know, I’d welcome your comments on this.