lidar_label_builder.lidar_label_builder

A collection of functions that imports raster and polygon data, creates random subareas, loops through these subareas prompting the user to label the area based on a list of input label options, and adds the result into a geodatabase. Its intended use is to facilitate the creation of training databases for convolutional neural networks.

   1"""
   2A collection of functions that imports raster and polygon data, creates random subareas, loops through these subareas prompting 
   3the user to label the area based on a list of input label options, and adds the result into a geodatabase. Its intended use is to facilitate
   4the creation of training databases for convolutional neural networks. 
   5    
   6"""
   7import os
   8import sys
   9import geopandas as gpd
  10import numpy as np
  11from shapely.geometry import Polygon
  12import matplotlib.pyplot as plt
  13from matplotlib.widgets import RadioButtons, Button
  14from osgeo import gdal, osr
  15import warnings
  16from PIL import Image
  17import shutil
  18import json
  19from pathlib import Path
  20gdal.UseExceptions()
  21
  22plt.rcParams.update({'font.size': 24})#Set global font size for plot
  23
  24# with open(os.path.join(os.path.dirname(os.path.dirname(os.path.abspath(__file__))), 'configs', 'global_variables.json'), 'r') as f:
  25#     params_dict = json.load(f)
  26with (Path(__file__).resolve().parent.parent / 'configs' / 'global_variables.json').open('r') as f:
  27    params_dict = json.load(f)
  28
  29DEFAULT_FORMAT = params_dict['DEFAULT_FORMAT']
  30DEFAULT_EXT = params_dict['DEFAULT_EXT']
  31DEFAULT_CMAP = params_dict['DEFAULT_CMAP']
  32
  33MERGED_RASTER_FOLDER = params_dict['MERGED_RASTER_FOLDER']
  34TEMP_RASTER_DOWNLOAD_FOLDER = params_dict['TEMP_RASTER_DOWNLOAD_FOLDER']
  35
  36UNASSIGNED_LABELNAME = params_dict['UNASSIGNED_LABELNAME']
  37ROIDF_LABEL_COL = params_dict['ROIDF_LABEL_COL']
  38CLIPPED_ROI_FOLDER = params_dict['CLIPPED_ROI_FOLDER']
  39IMG_FOLDER = params_dict['IMG_FOLDER']
  40RSTR_COL_IDENTIFIER = params_dict['RSTR_COL_IDENTIFIER']
  41ROIDF_IMG_COL_PATTERN = params_dict['ROIDF_IMG_COL_PATTERN']
  42LABEL_FILE_PATTERN = params_dict['LABEL_FILE_PATTERN']
  43LABEL_FILE_EXT = params_dict['LABEL_FILE_EXT']
  44
  45#DEM GETTER ARGS
  46# with open(os.path.join(os.path.dirname(os.path.dirname(os.path.abspath(__file__))), 'configs', 'demgetter_variables.json'), 'r') as f:
  47#     dg_dict = json.load(f)
  48with (Path(__file__).resolve().parent.parent / 'configs' / 'demgetter_variables.json').open('r') as f:
  49    dg_dict = json.load(f)
  50DG_DATASET = dg_dict['DG_DATASET']
  51DERIVATIVE_NAMES_DG = dg_dict['DERIVATIVE_NAMES_DG']
  52DEM_GETTER_KWARGS = dg_dict['DEM_GETTER_KWARGS']
  53
  54#The following variables/library are for querying webmaps
  55from owslib.wms import WebMapService
  56# with open(os.path.join(os.path.dirname(os.path.dirname(os.path.abspath(__file__))), 'configs', 'wms_variables.json'), 'r') as f:
  57#     wms_dict = json.load(f)
  58with (Path(__file__).resolve().parent.parent / 'configs' / 'wms_variables.json').open('r') as f:
  59    wms_dict = json.load(f)
  60WMSPATH = wms_dict['WMSPATH']
  61VERSION = wms_dict['VERSION']
  62RASTERLAYERS = wms_dict['RASTERLAYERS']
  63LAYERNAMES = wms_dict['LAYERNAMES']
  64TILESIZE = tuple(wms_dict['TILESIZE'])
  65WMS_KWARGS = wms_dict['WMS_KWARGS']
  66
  67def get_raster_as_grid(raster):
  68    """Loads in a digital elevation model as a numpy array and converts any no-data to np.nan
  69
  70        Args:
  71            raster (str OR gdal.Dataset): The path to a single band, gdal readable, digital elevation model OR
  72                an already loaded raster dataset
  73
  74        Returns:
  75            rasterGrid (numpy.ndarray): The elevation data stored as a grid
  76            dx (float): The x coordinate spacing of the grid
  77            dy (float): The y coordinate spacing of the grid
  78
  79        Raises:
  80            Exception: Input is neither a path to a raster or a gdal.Dataset
  81        """
  82    
  83    #If this is a raster dataset
  84    if isinstance(raster,gdal.Dataset):
  85        doClose = False
  86    
  87    #If this is a file
  88    elif os.path.isfile(raster):   
  89        #Get the raster grid
  90        raster = gdal.Open(raster)
  91        
  92
  93        #Close this file after the operation completes
  94        doClose = True
  95    else:
  96         Exception('Specified raster is neither a path to a raster or a gdal.Dataset. Please specify a valid path.')
  97
  98    rasterGrid = raster.ReadAsArray().astype(float)
  99    NDV = raster.GetRasterBand(1).GetNoDataValue()
 100
 101    #Mask out NDVs as nan
 102    rasterGrid[rasterGrid==NDV] = np.nan
 103
 104    # Grab the basic header information (xUL, dx, rot1, yUL, rot2, dy)
 105    geotransform = raster.GetGeoTransform()  
 106    
 107    dx = geotransform[1]
 108    dy = geotransform[-1]
 109
 110    if doClose:
 111        raster = None #Close the raster
 112
 113    return rasterGrid, dx, dy
 114
 115def get_spatial_ref_from_raster(rasterFile):
 116    """Extracts spatial reference information and the epsg code as a string from an input raster path.
 117
 118    Args:
 119        pathToRaster (str): The path to a single band, gdal readable, digital elevation model OR
 120        an already loaded raster dataset
 121
 122    Returns:
 123        spatialRef (osgeo.osr.SpatialReference): Spatial reference information
 124        epsg (str): A string representing the EPSG (European Petroleum Survey Group) code associated with the
 125        spatial reference, or None if the EPSG code is not available.
 126    """
 127    
 128    if isinstance(rasterFile, str):
 129        # If the input is a path, open the raster using gdal.Open()
 130        raster = gdal.Open(rasterFile)
 131    elif isinstance(rasterFile, gdal.Dataset):
 132        # If the input is already a raster dataset, use it directly
 133        raster = rasterFile
 134    else:
 135        raise ValueError("Invalid input type. Provide either a path to a raster or a gdal.Dataset.")
 136
 137    spatialRef = osr.SpatialReference(wkt=raster.GetProjection()) #get spatial info as string
 138    epsg = spatialRef.GetAuthorityCode(None) #get epsg code as string
 139    
 140    raster = None #close raster
 141
 142    return spatialRef, epsg
 143
 144def get_spatial_ref_from_shapefile(shapefile):
 145    """Retrieves the spatial reference and EPSG code from a shapefile or GeoDataFrame.
 146
 147    This function accepts either the path to a shapefile or a `GeoDataFrame`. It reads the shapefile (if a path is provided),
 148    extracts the spatial reference as a WKT (Well-Known Text) string, and then converts it to an `osgeo.osr.SpatialReference` object.
 149    The function also retrieves the EPSG code associated with the spatial reference.
 150
 151    Args:
 152        shapefile (str or geopandas.GeoDataFrame): A string path to the shapefile or a `GeoDataFrame` object.
 153
 154    Raises:
 155        ValueError: If the input is neither a valid file path to a shapefile nor a `GeoDataFrame`.
 156
 157    Returns:
 158        tuple: A tuple containing:
 159            - `osgeo.osr.SpatialReference`: The spatial reference object.
 160            - `str`: The EPSG code as a string, or `None` if the EPSG code cannot be determined.
 161    """
 162    if isinstance(shapefile, str):
 163        gdf = gpd.read_file(shapefile)
 164    elif isinstance(shapefile, gpd.GeoDataFrame):
 165        gdf = shapefile
 166    else:
 167        raise ValueError("Invalid input type. Provide either a path to a shapefile or a GeoDataFrame.")
 168
 169    spatialRefWkt = gdf.crs.to_wkt()
 170    spatialRef = osr.SpatialReference(wkt=spatialRefWkt)
 171
 172    #Get epsg code
 173    epsg = spatialRef.GetAuthorityCode(None)
 174    
 175
 176    return spatialRef, epsg
 177
 178def get_polygon_from_file(pathToPolygon:str, epsg:int=None):
 179    """Reads polygon geometries and CRS from a file and optionally reprojects the geometry to specified epsg.
 180    
 181    Args:
 182        pathToPolygon (str): The path to the file containing the polygons.
 183        epsg (int or None): The target EPSG code for reprojection. If None, no reprojection is done.
 184    Returns:
 185        polygon (shapely.geometry.Polygon): A shapely Polygon representing the area within which random ROIs will be generated.
 186        crs (str): The coordinate reference system of the loaded polygon.
 187    """
 188    gdf = gpd.read_file(pathToPolygon)
 189
 190    if epsg is not None:
 191        gdf = gdf.to_crs('EPSG:{}'.format(epsg))
 192
 193    polygon = gdf['geometry'].iloc[0]  
 194    
 195    crs = gdf.crs
 196
 197    return polygon,crs
 198
 199def get_polygon_from_raster(pathToDem:str, demEPSG):
 200    """Constructs a polygon around the boundary of a digital elevation model (DEM) raster.
 201
 202    This function takes the path to a single-band, GDAL-readable digital elevation model 
 203    and constructs a polygon that represents the boundary defined by the minimum and 
 204    maximum x and y values of the raster.
 205
 206    Args:
 207        pathToDem (str): The path to the DEM file.
 208        demEPSG (int): The EPSG code for the coordinate reference system of the DEM.
 209
 210    Returns:
 211        tuple: A tuple containing:
 212            - polygon (Polygon): A Shapely Polygon object representing the boundary of the DEM.
 213            - crs (CRS): The coordinate reference system of the polygon as a GeoPandas CRS object.
 214
 215    """
 216    # Open the DEM file using GDAL
 217    ds = gdal.Open(pathToDem)
 218
 219    # Retrieve the geotransform parameters
 220    xmin, xpixel, _, ymax, _, ypixel = ds.GetGeoTransform()
 221    # Retrieve the geotransform parameters
 222    width, height = ds.RasterXSize, ds.RasterYSize 
 223
 224    # Calculate the maximum x and minimum y values based on the raster dimensions
 225    xmax = xmin + width * xpixel
 226    ymin = ymax + height * ypixel
 227
 228    # Create a polygon using the calculated boundary coordinates
 229    polygon = Polygon([(xmin, ymax), (xmax, ymax), (xmax, ymin), (xmin, ymin), (xmin, ymax)])
 230
 231    # Create a GeoDataFrame to store the polygon with the specified CRS
 232    gdf = gpd.GeoDataFrame(geometry=[polygon], crs='EPSG:{}'.format(demEPSG))
 233
 234    # Extract the polygon geometry from the GeoDataFrame
 235    polygon = gdf['geometry'].iloc[0]
 236    
 237    # Return the polygon and its coordinate reference system
 238    return polygon, gdf.crs
 239
 240def get_spatial_ref_and_polygon(rasterMethod: str, polygonSource: str, pathToPolygon=None, rasterPaths=None):
 241    """Retrieve spatial reference and polygon information based on specified raster and polygon sources.
 242
 243    This function determines the spatial reference and polygon geometry based on the provided 
 244    raster and polygon sources. It supports two methods for raster retrieval ('file' or 'wms') 
 245    and two sources for polygon information ('file' or 'dem').
 246
 247    Args:
 248        rasterMethod (str): Method to retrieve raster information. Options:
 249            - 'file': Use local raster files.
 250            - 'wms': Use a Web Map Service for raster data.
 251        polygonSource (str): Source of the polygon. Options:
 252            - 'file': Load polygon from a file.
 253            - 'dem': Generate polygon from a digital elevation model (DEM).
 254        pathToPolygon (str, optional): Path to the polygon file (required if polygonSource is 'file'). Defaults to None.
 255        rasterPaths (List[str], optional): List of paths to raster files (required if rasterMethod is 'file'). Defaults to None.
 256
 257    Raises:
 258        ValueError: If raster paths are not provided when rasterMethod is 'file'.
 259        ValueError: If a polygon path is not provided when polygonSource is 'file'.
 260        ValueError: If an invalid polygon source is specified (must be 'file' or 'dem').
 261        ValueError: If attempting to extract a polygon from a raster when raster method is 'wms'.
 262        ValueError: If there are issues obtaining spatial reference or polygon information.
 263
 264    Returns:
 265        Tuple[TypeSpatialRef, TypeEPSG, TypePolygon, TypeCRS]: A tuple containing:
 266            - spatialRef: The spatial reference of the raster or polygon.
 267            - epsg: The EPSG code associated with the spatial reference.
 268            - polygon: The geometry of the polygon.
 269            - crs: The coordinate reference system of the polygon.
 270    """
 271    
 272    # Check the raster method and retrieve the spatial reference accordingly
 273    if rasterMethod == 'file':
 274        if rasterPaths is None:
 275            raise ValueError('Must provide raster paths when rasterMethod is "file".')
 276        # Get spatial reference and EPSG code from the first raster file
 277        spatialRef, epsg = get_spatial_ref_from_raster(rasterPaths[0])
 278    
 279    else:  # rasterMethod is 'wms'
 280        if pathToPolygon is None:
 281            raise ValueError('Must provide a polygon path when rasterMethod is "wms".')
 282        # Get spatial reference from the shapefile
 283        spatialRef, epsg = get_spatial_ref_from_shapefile(pathToPolygon)
 284
 285    # Retrieve polygon based on the specified polygon source
 286    if polygonSource == 'file':
 287        if pathToPolygon is None:
 288            raise ValueError('Must provide a polygon path when polygonSource is "file".')
 289        # Get polygon geometry and CRS from the specified file
 290        polygon, crs = get_polygon_from_file(pathToPolygon, epsg)
 291    
 292    elif polygonSource == 'dem':
 293        if rasterMethod == 'wms':
 294            raise ValueError('Cannot extract polygon from raster when raster method is "wms". '
 295                             'Please change the value for polygonSource or rasterMethod.')
 296        # Get polygon geometry from the raster
 297        polygon, crs = get_polygon_from_raster(rasterPaths[0], epsg)
 298    
 299    else:
 300        raise ValueError('Invalid polygon source specified. Valid options are "file" or "dem".')
 301
 302    return spatialRef, epsg, polygon, crs
 303
 304def _request_yn_input(question: str):
 305    '''This is a helper function to get a 'Y'/'N' (e.g., yes/no, true/false)
 306    response to a question from the user via the command prompt. 
 307    
 308    It will continue to query the user untill a valid respons is issued.
 309
 310    Parameters
 311    ----------
 312    question : str
 313        The question that will be asked of the user.
 314
 315    Returns
 316    -------
 317    binaryResponse : TYPE
 318        True false based on if the user respond 'Y' (True) or 'N' (False).
 319
 320    '''
 321    response = str(input(question + '? (Y/N): ')).lower().strip()
 322
 323    if response[:1] == 'y':
 324        binaryResponse = True
 325    elif response[:1] == 'n':
 326        binaryResponse = False
 327    else:
 328        print('Whoopsy, please enter Y or N')
 329        binaryResponse = _request_yn_input()
 330
 331    return binaryResponse
 332
 333def handle_progress(filePath:str, overwrite, interactive):
 334    """Checks for existing file, prompts user to select load or overwrite, and stores selection in load variable. 
 335
 336    Args:
 337        filePath (str): Path to geodatabase where clipped rasters and labels are stored. 
 338
 339    Raises:
 340        ValueError: If user first selects (n) to overwrite and then selects (n) to not proceed with overwrite this error is raised so user can rerun the tool and make the correct selections.
 341
 342    Returns:
 343        load (bool): Returns true if user selects to load temporary data.
 344    """
 345    #Check progress to see if there is there is a temporary file associated with aoiLabel
 346    inProgress = os.path.exists(filePath)
 347
 348    #Prompts user to make selections on how to proceed with temporary data.
 349    if inProgress:
 350        if interactive:
 351            qStr = "File associated with aoi label detected.\nDo you want to continue labeling (y) or overwrite existing progress (n)"
 352            load = _request_yn_input(question = qStr)
 353            if not(load):
 354                confirmStr = "You are about to overwrite progress. Proceed"
 355                confirmOverwrite = _request_yn_input(confirmStr)
 356                if not(confirmOverwrite):
 357                    raise ValueError('User canceled overwrite of dataframe. Try again and correctly specify load or overwrite.')
 358        elif overwrite:
 359            load = False
 360        else:
 361            load = True
 362    else:
 363        load = False #returns False to prompt creation of new rois
 364
 365    return load
 366
 367def rm_non_intersecting_rows(targetDf, intersectsPoly):
 368    """Remove rows from a DataFrame that do not intersect with a specified polygon.
 369
 370    This function iterates through each row of the input DataFrame and checks if the geometry 
 371    of each row intersects with the provided polygon. Rows that do not intersect are marked for 
 372    removal. The resulting DataFrame contains only the rows with geometries that intersect the polygon.
 373
 374    Args:
 375        targetDf (GeoDataFrame): A GeoDataFrame containing geometries to be checked for intersection.
 376        intersectsPoly (Polygon): A Shapely Polygon object used to check for intersections with the geometries.
 377
 378    Returns:
 379        GeoDataFrame: A new GeoDataFrame containing only the rows that intersect with the specified polygon.
 380    """
 381    
 382    # Initialize a list to keep track of indices of rows to drop
 383    rows_to_drop = []
 384    
 385    # Iterate over each row in the DataFrame
 386    for i, row in targetDf.iterrows():
 387        # Check if the geometry of the current row intersects with the specified polygon
 388        intersects = row['geometry'].intersects(intersectsPoly)
 389        
 390        # If there is no intersection, mark the row index for dropping
 391        if not intersects:
 392            rows_to_drop.append(i)
 393    
 394    # Drop the non-intersecting rows from the DataFrame
 395    outDf = targetDf.drop(index=rows_to_drop)
 396    
 397    # Reset the index of the resulting DataFrame
 398    outDf.reset_index(drop=True, inplace=True)
 399    
 400    return outDf
 401
 402def create_rand_rois(polygon, crs:str, numRoi:int, roiWidth:float):
 403    """
 404    Generates randomly located, non intersecting Region of Interest (ROI) polygons that are fully contained within the input polygon.
 405
 406    Args:
 407        polygon (shapely.geometry.Polygon): A shapely Polygon representing the area within which random ROIs will be generated.
 408        crs (str): The coordinate reference system (CRS) for the generated GeoDataFrame.
 409        numRoi (int): The number of random ROIs to create.
 410        roiWidth (float): Width of each ROI polygon in meters.
 411
 412    Returns:
 413        roidf (gpd.GeoDataFrame): A GeoDataFrame containing the random ROIs with the specified crs. 
 414    """
 415    roiGeoms = []
 416
 417    # find xy range for polygon
 418    xmin, ymin, xmax, ymax = polygon.bounds
 419
 420    tryCnt = 0 #Keep up with how many times you've tried an input
 421    maxTries = int(polygon.area / roiWidth**2) #Calculate the maximum possible number of non-overlapping ROis and set value to maxTries
 422    
 423    while len(roiGeoms) < numRoi and tryCnt < maxTries:
 424    
 425        tryCnt+=1 #Increment the tryCnt
 426
 427        # get point within range
 428        xi = np.random.rand(1)[0] * (xmax - xmin) + xmin
 429        yi = np.random.rand(1)[0] * (ymax - ymin) + ymin
 430
 431        # create coordinate pairs for points
 432        roiHalfWidth = roiWidth / 2.0
 433        xL = xi - roiHalfWidth
 434        xR = xi + roiHalfWidth
 435        yB = yi - roiHalfWidth
 436        yT = yi + roiHalfWidth
 437
 438        newRoi = Polygon([(xL, yT), (xR, yT), (xR, yB), (xL, yB), (xL, yT)])
 439
 440        inPolygon = polygon.contains(newRoi)
 441        intersectsExisting = any(existingRoi.intersects(newRoi) for existingRoi in roiGeoms)
 442
 443
 444        # Check if point is within polygon, and if the new ROI intersects with any existing ROIs
 445        if inPolygon and not intersectsExisting:
 446            roiGeoms.append(newRoi)
 447            
 448
 449    #Warns user if they had almost gotten stuck in an infinite loop trying to find ROIs with the specified numROI.
 450    if tryCnt >= maxTries:
 451
 452        warnings.simplefilter("always")
 453        warnings.warn('Could not find requested number of ROIs after {} tries. Found a total of {} ROIs.'.format(tryCnt,len(roiGeoms)))
 454    
 455    roidf = gpd.GeoDataFrame(geometry=roiGeoms, crs=crs)
 456
 457    return roidf
 458
 459def create_rand_grid(polygon, roiWidth:float, crs:str, numRois:int = None):
 460    """Creates a square grid with grid length of roiWidth across the input polygon and assigns a random index
 461    value to each grid cell for random labeling. 
 462
 463    Args:
 464        polygon (shapely.geometry.Polygon): A shapely Polygon representing the area within which random ROIs will be generated.
 465        roiWidth (float): Width of each ROI polygon in meters.
 466        crs (str): The coordinate reference system (CRS) of the input polygon. Used to project resulting grid into same crs.
 467        numRois (int, optional): Number of random subareas to generate.
 468
 469    Returns:
 470        roidf (gpd.GeoDataFrame): A GeoDataFrame containing the random ROIs with the specified crs.
 471    """
 472    #Get the extent of the polygon
 473    xmin, ymin, xmax, ymax = polygon.bounds
 474
 475    xcoords = np.arange(xmin, xmax, roiWidth) #create a list of the bottom left x coordinates for grids
 476    ycoords = np.arange(ymin, ymax, roiWidth) #create a list of the bottom left y coordinates for grids
 477
 478    gridCells = [] #create an empty list to store constructed grid cells
 479
 480    for x in xcoords:
 481        for y in ycoords:
 482            gridCell = Polygon([(x,y),(x, y+ roiWidth),(x+roiWidth, y+roiWidth),(x + roiWidth, y),(x,y)]) #construct a grid polygon starting from bl corner
 483            gridCells.append(gridCell) #add this grid cell to list
 484   
 485   
 486    roidfUnfiltered = gpd.GeoDataFrame(geometry = gridCells, crs = crs) #add grid cells to roidf
 487    
 488    #Removes any polygons that are completely outside the bounding polygon. This is useful for irregular shaped input polygons.
 489    roidf = rm_non_intersecting_rows(roidfUnfiltered, polygon)
 490    n = len(roidf)
 491
 492    #If a subset of ROIS was requested
 493    if numRois:
 494        #Then subset based on numrois
 495        if numRois>n:
 496            numRois = n
 497        randIndex = np.random.choice(n,numRois,replace=False)
 498        roidf = roidf.iloc[randIndex].reset_index(drop=True)
 499    else:
 500        print("Number of grid cells: {}.".format(n))
 501        
 502    return roidf
 503
 504def clip_by_rois(rasterPaths, spatialRef, roidf: gpd.GeoDataFrame, aoiLabel, outDirectory:str):
 505    """"Clips an input raster to the extend of the Regions of Interest (ROIs) from 
 506    a geopandas GeoDataFrame and appends the paths to the corresponding clipped raster
 507    to the ROI GeoDataFrame. 
 508
 509    Args:
 510        rasterPaths (str or list): List of filepaths to raster datasets or a string pointing to one raster.
 511        spatialRef (osgeo.osr.SpatialReference): Desired spatial reference information for clipped rasters. 
 512        Should be the same as input dem and polygons.
 513        roidf (gpd.GeoDataFrame): A GeoDataFrame containing the ROIs for clipping.
 514        aoiLabel (str or int): User assigned label for area of interest.
 515        outDirectory (str): Filepath to the folder where the geodataframe and related components are stored.
 516
 517    Returns:
 518        roidf (gpd.GeoDataFrame): A GeoDataFrame containing ROI polygons with an additional columns
 519        containing file paths to the rasters.
 520    """
 521    
 522    rois = roidf.geometry #access the geometry for rois from roidf
 523    
 524    try:
 525        # Try to unpack the tuple
 526        spatialRef, epsg = spatialRef
 527    except (TypeError, ValueError):
 528        # Handle the case where spatialRef is not a tuple
 529        pass
 530
 531    if isinstance(rasterPaths, list):
 532        for k, rasterPath in enumerate(rasterPaths):
 533            clippedRasters = [] #Initialize a list to store clipped dems
 534
 535            #Looks for a clippedRois folder in raster directory and creates if does not exist
 536            outFolder = os.path.join(outDirectory, CLIPPED_ROI_FOLDER)
 537            if not os.path.exists(outFolder):
 538                os.makedirs(outFolder)
 539
 540            # Extract WKT from the spatialRef tuple
 541            wkt = spatialRef.ExportToWkt()
 542            
 543
 544            for i,roi in enumerate(rois):
 545                xmin, ymin, xmax, ymax = roi.bounds
 546                outExtent = [xmin, ymin, xmax, ymax]
 547                wrpOptions = gdal.WarpOptions(
 548                    outputBounds=outExtent,
 549                    srcSRS=wkt,
 550                    dstSRS=wkt
 551                )
 552
 553                fname = os.path.split(rasterPath)[1]
 554                fname = os.path.splitext(fname)[0]  #gives filename and extension as two outputs
 555                
 556                outPath = os.path.join(outFolder, fname + '_{}'.format(aoiLabel) + '_r{}'.format(i)+DEFAULT_EXT)
 557                gdal.Warp(outPath,rasterPath,options=wrpOptions)
 558                clippedRasters.append(outPath)
 559
 560            #If a raster clip columna associated with this raster index exists, remove it:
 561            if f'rstr_clp{k}' in roidf.columns:
 562                roidf = roidf.drop(f'rstr_clp{k}', axis=1)
 563
 564            #make a new column and assign it the filepaths to the created clipped raster:
 565            roidf[f'rstr_clp{k}'] = clippedRasters
 566    elif isinstance(rasterPaths, str):
 567        clippedRasters = []  # Initialize a list to store clipped DEMs
 568
 569        # Look for a 'clippedRois' folder in the raster directory and create it if it does not exist
 570        outFolder = os.path.join(outDirectory, CLIPPED_ROI_FOLDER)
 571        if not os.path.exists(outFolder):
 572            os.makedirs(outFolder)
 573
 574        # Extract WKT from the spatialRef tuple
 575        wkt = spatialRef.ExportToWkt()
 576
 577        for i, roi in enumerate(rois):
 578            xmin, ymin, xmax, ymax = roi.bounds
 579            outExtent = [xmin, ymin, xmax, ymax]
 580            wrpOptions = gdal.WarpOptions(
 581                outputBounds=outExtent,
 582                srcSRS=wkt,
 583                dstSRS=wkt
 584            )
 585
 586            fname = os.path.split(rasterPaths)[1]
 587            fname = os.path.splitext(fname)[0]  # gives filename and extension as two outputs
 588            
 589            outPath = os.path.join(outFolder, f"{fname}_{aoiLabel}_r{i}{DEFAULT_EXT}")
 590            gdal.Warp(outPath, rasterPaths, options=wrpOptions)
 591            clippedRasters.append(outPath)
 592
 593        #If a column exists named rstr_clp, remove it:
 594        if 'rstr_clp0' in roidf.columns:
 595            roidf = roidf.drop('rstr_clp0', axis=1)
 596
 597        #make a new column and assign it the filepaths to the created clipped raster
 598        roidf['rstr_clp0'] = clippedRasters
 599    return roidf
 600
 601def batch_download_merge_dg(polygon, epsg, aoiLabel, outDirectory, deleteTempFiles= True):
 602    """Batch download and merge digital elevation models (DEMs) based on a specified polygon with the dem_getter.
 603
 604    This function uses the dem_getter to retrieve the AWS paths for DEMs that intersect with the provided polygon, 
 605    download them, and merge them into a single raster file. The output is saved in the specified 
 606    directory with a filename based on the area of interest (AOI) label.
 607
 608    Args:
 609        polygon (Polygon): A Shapely Polygon object defining the area of interest for downloading DEMs.
 610        epsg (int): The EPSG code for the coordinate reference system to be used for the output raster.
 611        aoiLabel (str): A label for the area of interest, used to name the output raster file.
 612        outDirectory (str): The directory where the merged raster file will be saved.
 613        deleteTempFiles (bool, optional): Flag indicating whether to delete temporary files after merging. Defaults to True.
 614
 615    Returns:
 616        str or None: The file path of the merged raster if successful, or None if no paths were found for download.
 617    """
 618    try:
 619        import dem_getter as dg
 620    except ImportError:
 621        print("Failed to import the 'dem_getter' module. Please ensure that you have set up your environment correctly.")
 622        print("For setup instructions, refer to the Setting Up the Dem Getter section in README document.")
 623
 624    
 625    # Extract x and y coordinates from the polygon's exterior
 626    x,y = polygon.exterior.coords.xy
 627
 628    # Get the AWS paths for DEMs that intersect with the polygon
 629    paths = dg.get_aws_paths_polygon(DG_DATASET, x, y, inputEPSG=epsg)
 630
 631    # Define the directory for saving the merged raster
 632    saveDir = os.path.join(outDirectory, MERGED_RASTER_FOLDER)
 633
 634    # Create the directory if it does not exist
 635    if not os.path.exists(saveDir):
 636            os.makedirs(saveDir)
 637
 638    # Proceed if there are paths available for download
 639    if paths:
 640        print("Batch Downloading...")
 641
 642        # Define the temporary directory for downloaded rasters
 643        tempSaveDir = os.path.join(outDirectory, TEMP_RASTER_DOWNLOAD_FOLDER) 
 644        if not os.path.exists(tempSaveDir):
 645            os.makedirs(tempSaveDir)
 646
 647        # Download the DEM files
 648        filelist = dg.batch_download(paths, tempSaveDir, doForceDownload = True)
 649
 650        # Define the output filename and path for the merged raster
 651        fname = f'{aoiLabel}.tif'
 652        outPath = os.path.join(saveDir, fname)
 653
 654        # Define the extent for merging
 655        mergeExtent = ([min(x), max(x)],[min(y),max(y)])
 656
 657        print("Merging DEMS...")
 658
 659        # Merge the downloaded DEMs into a single raster
 660        dg.merge_warp_dems(filelist, outPath, mergeExtent, epsg)
 661
 662        # Optionally delete temporary files after merging
 663        if deleteTempFiles:
 664            if os.path.exists(tempSaveDir):
 665                shutil.rmtree(tempSaveDir)
 666    else:
 667         outPath = None # No paths found for download
 668
 669    return outPath  # Return the path of the merged raster or None
 670
 671
 672def clip_by_rois_demgetter(roidf:gpd.GeoDataFrame, polygon, aoiLabel, outDirectory, spatialRef):
 673    """Connects to the dem getter tool to download rasters and derivatives and crops these by roi polygons.
 674
 675    Args:
 676        roidf (gpd.GeoDataFrame): GeoDataFrame containing ROIs with geometries.
 677        polygon (shapely.geometry.Polygon): A shapely Polygon representing the area within which random ROIs will be generated.
 678        aoiLabel (str): A label to identify the region of interest.
 679        outDirectory (str): Filepath to the folder where the geodataframe and related components are stored.
 680        spatialRef (osgeo.osr.SpatialReference): Desired spatial reference information for clipped rasters. 
 681
 682    Returns:
 683        roidf (gpd.GeoDataFrame): A GeoDataFrame containing ROI polygons with an additional columns containing file 
 684        paths to the rasters.
 685    """
 686
 687    try:
 688        import dem_getter as dg
 689    except ImportError:
 690        print("Failed to import the 'dem_getter' module. Please ensure that you have set up your environment correctly.")
 691        print("For setup instructions, refer to the Setting Up the Dem Getter section in README document.")
 692
 693
 694    # Extract the EPSG code
 695    epsg = spatialRef.GetAuthorityCode(None)
 696    
 697    #Download and merge
 698    demPath = batch_download_merge_dg(polygon, epsg, aoiLabel, outDirectory)
 699
 700    #Specify save location matching the DEM save directory
 701    saveDir = os.path.join(outDirectory, MERGED_RASTER_FOLDER)
 702
 703    #Compute derivatives from the specified list of derivatives to compute
 704    pathList = dg.compute_derivatives(demPath, DERIVATIVE_NAMES_DG, saveDir)
 705
 706    # Create a list of raster paths including derivatives and dems
 707    rasterPathsDg = [demPath]
 708    for derivPath in pathList:
 709        rasterPathsDg.append(derivPath)
 710
 711    #Clip rasters to roi geometry
 712    roidf = clip_by_rois(rasterPathsDg, spatialRef, roidf, aoiLabel, outDirectory)
 713
 714    return roidf
 715
 716def clip_by_rois_wms(wmsPath:str,version:str,rasterLayers:list,layerNames:list,outDirectory:str, roidf: gpd.GeoDataFrame, aoiLabel:str, tilesize:tuple):
 717    """This function connects to a WMS and clips specified raster layers based on the geometries of ROIs in a GeoDataFrame. The
 718    clipped raster files are saved in a specified output directory.
 719
 720    Args:
 721        wmsPath (str): The URL/path to the WMS service.
 722        version (str): The version of the WMS service (e.g., '1.1.1', '1.3.0').
 723        rasterLayers (list): List of raster layers to be clipped from the WMS.
 724        layerNames (list): List of corresponding names for the raster layers.
 725        outDirectory (str): Filepath to the folder where the geodataframe and related components are stored.
 726        roidf (gpd.GeoDataFrame): GeoDataFrame containing ROIs with geometries.
 727        aoiLabel (str): A label to identify the region of interest.
 728        tilesize (tuple): Size of the tiles for WMS requests (width, height).
 729
 730    Returns:
 731        roidf (gpd.GeoDataFrame): The input GeoDataFrame updated with columns containing paths to the clipped raster files.
 732    """
 733    # Connect to the wms
 734    wms = WebMapService(wmsPath,version=version)
 735    
 736    # Access roi geometry
 737    rois = roidf.geometry
 738    
 739    # Iterate over each raster layer and its corresponding name
 740    for k, (rasterLayer,layerName) in enumerate(zip(rasterLayers,layerNames)):
 741        clippedRasters = [] # Initialize a list to store paths of clipped rasters
 742
 743        # Define the output folder for clipped rasters
 744        outFolder = os.path.join(outDirectory, CLIPPED_ROI_FOLDER)
 745
 746        # Create the output folder if it does not exist
 747        if not os.path.exists(outFolder):
 748            os.makedirs(outFolder)
 749
 750        # Iterate over each ROI to clip the raster layer
 751        for i,roi in enumerate(rois):
 752            # Get the bounding box of the current ROI
 753            xmin, ymin, xmax, ymax = roi.bounds
 754            
 755            # Define the output path for the clipped raster
 756            outPath = os.path.join(outFolder, layerName + '_{}'.format(aoiLabel) + '_r{}'.format(i)+DEFAULT_EXT)
 757        
 758            try:
 759                # Request the map from the WMS service using the bounding box and specified parameters
 760                img = wms.getmap(layers = [rasterLayer],
 761                                format=DEFAULT_FORMAT,
 762                                srs='EPSG:{}'.format(roidf.crs.to_epsg()),
 763                                bbox=(xmin,ymin,xmax,ymax),
 764                                size = tilesize)
 765
 766                # Write the retrieved image to the specified output path
 767                with open(outPath,'wb') as out:
 768                    out.write(img.read())
 769                
 770                # Append the output path to the list of clipped rasters
 771                clippedRasters.append(outPath)
 772            except Exception as e:
 773                # Handle potential errors (e.g., timeouts) and log the issue
 774                print(f'Error occurred while clipping raster: {e}')
 775                clippedRasters.append('')
 776        # Add a new column to the GeoDataFrame with the paths of the clipped rasters    
 777        roidf['rstr_clp{}'.format(k)] = clippedRasters
 778    
 779    return roidf # Return the updated GeoDataFrame
 780
 781def min_max_by_percentile(raster, minPercentile=2, maxPercentile=98):
 782    """
 783    Calculate the minimum and maximum values of a raster dataset based on specified percentiles.
 784
 785    Parameters:
 786    - rasterPath (str): Path to the raster file.
 787    - minPercentile (float): Minimum percentile for clipping.
 788    - maxPercentile (float): Maximum percentile for clipping.
 789
 790    Returns:
 791    - clipped_data (numpy.ndarray): Clipped raster data.
 792    """
 793    # Check if the input is a string (indicating a file path)
 794    if isinstance(raster, str):
 795        # Load the raster data as a numpy array and convert any no-data values to np.nan
 796        rasterGrid = get_raster_as_grid(raster)[0]
 797
 798    # Check if the input is already a numpy array
 799    elif isinstance(raster, np.ndarray):
 800        rasterGrid = raster
 801    else:
 802        # Raise an error if the input type is invalid
 803        raise ValueError("Invalid type for raster. Should be either a string (file path) or a numpy array.")
 804
 805    # Calculate the minimum value based on the specified minimum percentile
 806    minVal = np.nanpercentile(rasterGrid, minPercentile)
 807    
 808    # Calculate the maximum value based on the specified maximum percentile
 809    maxVal = np.nanpercentile(rasterGrid, maxPercentile)
 810    
 811    return minVal, maxVal  # Return the calculated minimum and maximum values
 812
 813def derivs_to_png(derivs: list,plotkwargs:list, outDirectory:str, aoiLabel, roinum:int, newSize = None):
 814    """This function takes a list of derivative data, each associated with its own set of plotting parameters, and creates
 815    a composite PNG image. The image is saved to a specified output directory.
 816
 817    Args:
 818        derivs (list): List of derivative data to be visualized.
 819        plotkwargs (list): List of dictionaries containing plotting parameters for each derivative.
 820        outDirectory (str): Filepath to the folder where the output PNG image will be stored.
 821        aoiLabel (str or int): A label to identify the region of interest.
 822        roinum (int): An identifier for the specific region of interest.
 823        newSize (tuple, optional): Tuple specifying the new size of the output image (width, height). Defaults to None.
 824
 825    Returns:
 826        outImagePath (str): Filepath to the saved PNG image.
 827    """
 828     # Create the output folder for images if it does not exist
 829    outFolder = os.path.join(outDirectory, IMG_FOLDER)
 830    if not os.path.exists(outFolder):
 831            os.makedirs(outFolder)
 832    # Define the output image path
 833    outImagePath = os.path.join(outFolder, 'image{}_{}.png'.format(aoiLabel, roinum))
 834    
 835    # Initialize the output image as None
 836    outImage = None
 837
 838    # Iterate over each derivative and its corresponding plotting parameters
 839    for deriv, plotkwarg in zip(derivs,plotkwargs):
 840
 841        # Load the derivative data as a grid
 842        derivGrid = get_raster_as_grid(deriv)[0]
 843
 844        # Check for required scaling arguments; if not present, use percentiles for scaling
 845        if not('vmin' in plotkwarg):
 846            plotkwarg['vmin'] = min_max_by_percentile(derivGrid)[0]
 847        if not('vmax' in plotkwarg):
 848            plotkwarg['vmax'] = min_max_by_percentile(derivGrid)[1]
 849        
 850         # Check for required colormap; if not present, use the default colormap
 851        if not('cmap' in plotkwarg):
 852            plotkwarg['cmap'] = DEFAULT_CMAP
 853        
 854        # Scale the derivative data between 0 and 1 based on min/max values
 855        derivGrid = (derivGrid - plotkwarg['vmin'])/(plotkwarg['vmax'] - plotkwarg['vmin'])
 856        # Saturate values to ensure they stay within the range [0, 1]
 857        derivGrid[derivGrid<=0] = 0
 858        derivGrid[derivGrid>=1] = 1
 859
 860        # Get the colormap function from matplotlib
 861        colormapping = plt.get_cmap(plotkwarg['cmap'])
 862        # Apply the colormap to the derivative grid
 863        derivGrid = colormapping(derivGrid)
 864
 865        # Create a new image from the array and handle alpha transparency if specified
 866        newImage = Image.fromarray((derivGrid * 255).astype(np.uint8))
 867        if 'alpha' in plotkwarg:
 868            newImage.putalpha(int((1-plotkwarg['alpha'])*255))
 869        else:
 870            # Ensure the image is in RGBA format for compositing
 871            newImage = newImage.convert('RGBA')
 872        
 873        # Composite the new image onto the existing output image
 874        if not(outImage):
 875            outImage = newImage # If no image exists, set it as the output image
 876        else:
 877            outImage = Image.alpha_composite(outImage,newImage)  # Composite the new image
 878
 879    # Resize the output image if a new size is specified
 880    if newSize:
 881        outImage = outImage.resize(newSize,resample=3) # 3 for bicubic resampling
 882
 883    # Save the final composite image to the specified output path
 884    outImage.save(outImagePath)
 885    
 886    return outImagePath # Return the path to the saved image
 887
 888def make_imgs(roidf:gpd.GeoDataFrame, plotkwargs:list ,outDirectory:str, aoiLabel):
 889    """Generate PNG images for each region of interest (ROI) based on specified plotting parameters.
 890
 891    This function takes a GeoDataFrame containing regions of interest and associated raster paths and generates PNG
 892    images for each ROI using the `derivs_to_png` function. The image paths are added to the GeoDataFrame.
 893
 894    Args:
 895        roidf (gpd.GeoDataFrame): GeoDataFrame containing ROIs and associated raster paths.
 896        plotkwargs (list): List of dictionaries containing plotting parameters for each derivative.
 897        outDirectory (str): Filepath to the folder where the output PNG images will be stored.
 898        aoiLabel (str or int): A label to identify the region of interest.
 899
 900    Returns:
 901        roidf (gpd.GeoDataFrame): The input GeoDataFrame updated with a column containing paths to the generated PNG images.
 902    """
 903    # Identify columns in the GeoDataFrame that contain raster paths
 904    rasterCols = [col for col in roidf.columns if col.startswith(RSTR_COL_IDENTIFIER)]
 905    imgPaths = []
 906    
 907   # Iterate over each row in the GeoDataFrame
 908    for i, row in roidf.iterrows():
 909        rasterPaths = [] # Initialize a list to store raster paths for the current ROI
 910        for col in roidf[rasterCols]:
 911            path=row[col] # Get the raster path from the current row
 912            rasterPaths.append(path)  # Append the path to the list
 913
 914        # Generate the PNG image using the derivs_to_png function
 915        imgPath = derivs_to_png(rasterPaths,plotkwargs, outDirectory, aoiLabel, i, newSize = None)
 916        imgPaths.append(imgPath) # Append the generated image path to the list
 917    
 918    # Add the image paths to the GeoDataFrame in a new column
 919    roidf[ROIDF_IMG_COL_PATTERN] = imgPaths
 920
 921    return roidf # Return the updated GeoDataFrame
 922
 923def load_df(filePath):
 924    """Loads roidf from filepaths. 
 925
 926    Args:
 927        filePath (str): Path to geodatabase where clipped rasters and labels are stored.
 928
 929    Returns:
 930        roidf (gpd.GeoDataFrame): A GeoDataFrame loaded from the filepath containing ROI polygons and file paths to associated clipped DEMs.
 931    """
 932    # Load the GeoDataFrame from the specified file path
 933    roidf = gpd.read_file(filePath, truncation=False)
 934    roidf = roidf.fillna('') #Fill Nan values with empty strings
 935
 936    recoveryAttempted = False  # Flag to track whether recovery attempt has been made
 937        
 938    # Get raster and image column names
 939    rasterCols = [col for col in roidf.columns if col.startswith(RSTR_COL_IDENTIFIER)]
 940    imageCols = [col for col in roidf.columns if col == 'imgPaths']
 941
 942    # Iterate over each row in the GeoDataFrame
 943    for index, row in roidf.iterrows():
 944        # Check raster file paths
 945        for col in rasterCols:
 946            path = row[col]
 947            if not os.path.exists(path): # If the path does not exist
 948                if not recoveryAttempted:
 949                    print(f'Unable to locate some filepaths. Attempting to recover filepaths within the specified directory.')
 950                    recoveryAttempted = True
 951                directory = os.path.split(filePath)[0] # Get the directory of the file path
 952                fname = os.path.split(path)[1] # Get the filename
 953                tryPath = os.path.join(directory, CLIPPED_ROI_FOLDER, fname) # Construct the recovery path
 954                if os.path.exists(tryPath): # If the recovery path exists
 955                    roidf.at[index, col] = tryPath # Update the GeoDataFrame with the recovery path
 956                else:
 957                    raise ValueError(f"Unable to locate necessary filepaths.\nAttempted Path 1: {path}\nAttempted Path 2: {tryPath}")
 958
 959        # Check image file paths
 960        for col in imageCols:
 961            path = row[col]
 962            if not os.path.exists(path): # If the path does not exist
 963                if not recoveryAttempted:
 964                    print(f'Unable to locate some filepaths. Attempting to recover filepaths within the specified directory.')
 965                    recoveryAttempted = True
 966                directory = os.path.split(filePath)[0]  # Get the directory of the file path
 967                fname = os.path.split(path)[1] # Get the filename
 968                tryPath = os.path.join(directory, IMG_FOLDER, fname) # Construct the recovery path
 969                if os.path.exists(tryPath):  # If the recovery path exists
 970                    roidf.at[index, col] = tryPath  # Update the GeoDataFrame with the recovery path
 971                else:
 972                    raise ValueError(f"Unable to locate necessary filepaths.\nAttempted Path 1: {path}\nAttempted Path 2: {tryPath}")
 973
 974    return roidf # Return the loaded GeoDataFrame
 975
 976def make_roidf(rasterMethod, rasterPaths:list, spatialRef, polygon, crs, roiWidth:float, roiMethod, aoiLabel, outDirectory:str, plotkwargs:list, numRois:int=None, makeImgs:bool = False):
 977    """This function creates a GeoDataFrame with columns for ROI polygons, filepaths to associated clipped rasters,
 978    and images. The ROIs can be generated either randomly around points or in a grid within a specified polygon.
 979    Clipped rasters and images are obtained from either local files or a web map service (WMS).
 980
 981
 982    Args:
 983        rasterMethod (str): Method for getting rasters 'demgetter' if using the dem getter tool, 'file' if uploading raster files manually, or 'wms' if using a web map service. 
 984        rasterPath (list): List of filepaths to raster datasets.
 985        spatialRef (osgeo.osr.SpatialReference): Desired spatial reference information for clipped rasters.
 986        polygon (shapely.geometry.Polygon): A shapely Polygon representing the area within which random ROIs will be generated.
 987        crs (str): The coordinate reference system of the loaded polygon.
 988        roiWidth (float): Width of each ROI polygon in meters.
 989        roiMethod (str): Method used to generate random rois. Options: 'point' or 'grid'. If 'point' rois will be constructed 
 990        around randomly located points. If 'grid' input polygon will be broken into a roiWidth x roiWidth grid with each grid cell being an roi.
 991        aoiLabel (str or int): User assigned label for area of interest.
 992        outDirectory (str): Filepath to the folder where the geodataframe and related components are stored.
 993        plotkwargs (list): List of dictionaries containing plotting parameters for each derivative.
 994        numRois (int, optional): Number of random subareas to generate. Can only be None if roiMethod is grid. In this case will generate rois for entire gridded area.
 995
 996    Raises:
 997        ValueError: Raises error if rasterMethod is not correctly specified as 'file' or 'wms'. 
 998        ValueError: Raises error is roiMethod is not correctly specified as 'point' or 'grid'.
 999
1000    Returns:
1001        roidf (gpd.GeoDataFrame): A geopandas GeoDataFrame containing columns for roi polygons as well as filepaths to associated clipped rasters and images.
1002    """
1003
1004     # Define a mapping of ROI methods to their corresponding functions and arguments
1005    sourceFunc = {
1006    'point': (create_rand_rois,(polygon, crs, numRois, roiWidth)),
1007    'grid': (create_rand_grid, (polygon, roiWidth, crs, numRois))
1008    }
1009     # Check if the specified roiMethod is valid and call the corresponding function
1010    if roiMethod in sourceFunc:
1011        func,args = sourceFunc[roiMethod]
1012        roidf = func(*args) # Generate the ROIs
1013    else:
1014        raise ValueError('Roi method not correctly specified. Valid options: point or grid.')
1015    
1016    # Define a mapping of raster methods to their corresponding functions and arguments
1017    sourceFunc2 = {
1018    'demgetter': (clip_by_rois_demgetter,(roidf, polygon, aoiLabel, outDirectory, spatialRef)),
1019    'file': (clip_by_rois,(rasterPaths, spatialRef, roidf, aoiLabel, outDirectory)),
1020    'wms': (clip_by_rois_wms, (WMSPATH, VERSION, RASTERLAYERS, LAYERNAMES, outDirectory, roidf, aoiLabel, TILESIZE))
1021    }
1022    
1023    # Check if the specified rasterMethod is valid and call the corresponding function
1024    if rasterMethod in sourceFunc2:
1025        func,args = sourceFunc2[rasterMethod]
1026        roidf = func(*args) # Clip the data by rois
1027    else:
1028        raise ValueError('Raster method not correctly specified. Valid options: file or wms.')
1029
1030    # If requested, generate images for the ROIs
1031    if makeImgs:
1032        roidf = make_imgs(roidf, plotkwargs, outDirectory, aoiLabel)
1033
1034    return roidf # Return the created GeoDataFrame
1035
1036def assign_label(rasterPaths:list, labels: list, plotkwargs:list, i:int, selectedLabel:str = None):
1037    """Assign a label to a raster image interactively based on user input.
1038
1039    Args:
1040        rasterPaths (list): A list of file paths to clipped rasters for visualization. 
1041        labels (list): A list of label options for user to choose between. 
1042        plotkwargs (list): List of dictionaries containing plotting parameters for each derivative.
1043        i (int): index number for label
1044        selectedLabel (str, optional): Default label to be pre-selected.. Defaults to None.
1045
1046    Returns:
1047        assignedLabel (str): The label assigned to the raster image based on user input.
1048        goBack (bool): A boolean indicating if the user chose to go back.
1049    """
1050    # Create an empty list to store labels
1051    assignedLabel = []
1052    
1053    #Generate a flag for manual close, automatically set to true
1054    userClose = True
1055
1056    # Define button options including a global variable for an unassigned tile which the labeling window
1057    # starts on by default
1058    buttonOptions = [UNASSIGNED_LABELNAME] + labels
1059  
1060    # Add images to the plot
1061    try:
1062        derivs = [get_raster_as_grid(path)[0] for path in rasterPaths]
1063    except Exception as e:
1064        assignedLabel = "No Data"
1065        print(f'Error getting raster as grid: {e}. Label #{i} Automatically Assigned: {assignedLabel}')
1066        return assignedLabel
1067    
1068    # Initialize plotting environment
1069    fig,axs = plt.subplots(figsize=(20,18))
1070    fig.subplots_adjust(right = 0.65)
1071    
1072    # Loop through derivative and corresponding parameters
1073    for deriv, kwarg in zip(derivs, plotkwargs):
1074        # Determine vmin and vmax for plotting display
1075        if 'vmin' in kwarg:
1076            minVal = kwarg.pop('vmin')
1077        else:
1078            minVal= min_max_by_percentile(deriv)[0]
1079        if 'vmax' in kwarg:
1080            maxVal = kwarg.pop('vmax')
1081        else:
1082            maxVal = min_max_by_percentile(deriv)[1]
1083
1084        # Add rasters to plot
1085        axs.imshow(deriv, vmin = minVal, vmax =maxVal, **kwarg)
1086    
1087    # Set axis labels
1088    axs.set_xlabel("E-W Distance (pixels)")
1089    axs.set_ylabel("N-S Distance (pixels)")
1090
1091    # Define event handlers for user interactions
1092    def handle_radio(label):
1093        """Handles radio button selection."""
1094        if assignedLabel:
1095            assignedLabel.pop() # Remove previous label if exists
1096        assignedLabel.append(label) # Add new label
1097
1098    def submit_and_close(event):
1099        """Handles submission of the selected label."""
1100        nonlocal userClose
1101        if radioButtons.value_selected == UNASSIGNED_LABELNAME:
1102            print('Please Make Valid Selection') # Prompts for valid selection
1103        else:
1104            print(f"Label #{i} Assigned: {radioButtons.value_selected}")
1105            userClose = False # Close the plot
1106            plt.close(fig)
1107
1108    def quit(event):
1109        """Handles quitting the labeling process."""
1110        nonlocal userClose
1111        if assignedLabel:
1112            assignedLabel.pop() # Removes previous label if exists
1113        print("Quit Label Assign")
1114        userClose = False
1115
1116        plt.close(fig)
1117
1118    def on_close(event):
1119        """Handles the event when the plot is closed."""
1120        nonlocal userClose
1121        if userClose:
1122            assignedLabel.pop() # Removes label if user closed the plot
1123            print("Quit Label Assign")
1124            plt.close(fig)
1125
1126    def on_key(event):
1127        """Handles key press events for label selection."""
1128        if event.key == 'enter':
1129            submit_and_close(event) # Submit on enter key
1130        if event.key.isdigit() and 1<= int(event.key) <=len(labels):
1131            labelIndex = int(event.key) - 1 # Get label index from key
1132            label = labels[labelIndex]
1133            radioButtons.set_active(buttonOptions.index(label)) # Set radio button active
1134             
1135    # Display information about key-label mapping
1136    key_mapping = {str(idx + 1): label for idx, label in enumerate(labels)}
1137    key_info = "\n".join([f"[{key}] : {label}" for key, label in key_mapping.items()])
1138
1139    # Display information about key-label mapping
1140    text_properties = {'va': 'center', 'ha': 'left', 
1141                       'bbox': dict(facecolor='white', alpha=0.8, boxstyle='round,pad=0.3')
1142                       }
1143
1144    # Set properties for displaying key information
1145    fig.text(0.67, 0.8, "Keyboard Shortcuts:\n[Enter] : Submit\n{}".format(key_info), transform=plt.gcf().transFigure, **text_properties)
1146    
1147    # Add radio buttons to the plot    
1148    radioAx = fig.add_axes([0.65, 0.4, 0.2, 0.2], frameon=False)
1149    radioButtons = RadioButtons(radioAx,buttonOptions,active=0)
1150    radioButtons.on_clicked(handle_radio)
1151
1152    #If the selectedLabel is present, set it as the default
1153    if selectedLabel:
1154        matchIdx = buttonOptions.index(selectedLabel) # Find index of selected label
1155        radioButtons.set_active(matchIdx)  # Set the radio button to active
1156
1157    #Add submit button functionality
1158    submitButtonAx = fig.add_axes([0.67, 0.25, 0.2, 0.06]) # Define position for submit button
1159    submitButton = Button(submitButtonAx, 'Submit',color='0.8', hovercolor='0.95') # Create submit button
1160    submitButton.on_clicked(submit_and_close) # Link radio button click to submit function
1161
1162    #Add a Quit button
1163    quitButtonAx = fig.add_axes([0.67, 0.175, 0.2, 0.06])
1164    quitButton = Button(quitButtonAx,'Quit',color='0.8', hovercolor='0.95')  # Create quit button
1165    quitButton.on_clicked(quit)  # Link button click to quit function
1166
1167    # Connect key press and close events to their respective handlers
1168    fig.canvas.mpl_connect('key_press_event', on_key) # Handle key presses
1169    fig.canvas.mpl_connect('close_event', on_close)  # Handle plot close event
1170
1171    plt.show(block=True)  # Show the plot and block execution until closed
1172
1173    #Assign outputs
1174    if assignedLabel:
1175        assignedLabel = assignedLabel.pop() # Get the last assigned label if exists
1176
1177    return assignedLabel # Return the assigned label
1178
1179def create_labels(roidf, labels: list, plotkwargs:list, doReview: bool= False, interactive: bool=False):
1180    """Create labels for regions of interest (ROIs) in a GeoDataFrame based on user input. If the input geodataframe is already
1181    populated, gives user option to review labels. 
1182
1183    This function iterates through the clipped DEMs (Digital Elevation Models) in the input GeoDataFrame
1184    and assigns labels to each ROI interactively using the `assign_label` function. The assigned labels
1185    are added to the GeoDataFrame as a ROIDF_LABEL_COL column.
1186
1187    Args:
1188        roidf (gpd.GeoDataFrame): A GeoDataFrame containing ROIs and associated clipped DEMs.
1189        labels (list): A list of label options for user to choose between. 
1190        plotkwargs (list): List of dictionaries containing plotting parameters for each derivative.
1191        doReview (bool, optional): When true, specifies that the user would like to review labels for a complete geodatabase. Defauls to False.
1192        interactive (bool, optional): When true, turns on interactive user questions that allow the user to set overwrite. Defaults to False. 
1193
1194    Returns:
1195        roidf (gpd.GeoDataFrame): The input GeoDataFrame updated with a ROIDF_LABEL_COL column containing labels for each ROI.
1196    """
1197    # Create a label column if it does not already exist
1198    if not (ROIDF_LABEL_COL in roidf.columns):
1199        roidf[ROIDF_LABEL_COL]=['' for i in range(len(roidf))]
1200
1201    # Checks if labeling had already been completed for this gdf. 
1202    labelingCompleted = all(roidf[ROIDF_LABEL_COL] != '')
1203    review = False
1204    save = True
1205
1206    #If labeling has already been completed i.e. no empty labels in the label column, asks the user if they would like to review labels.
1207    if labelingCompleted:
1208        if interactive:
1209            qStr = 'Labeling already completed for this area. Review labels (y/n)'
1210            review = _request_yn_input(question=qStr)
1211        else:
1212            review = doReview
1213        if not review:
1214            save = False
1215            print('Labeling Completed for the file associated with this aoiLabel. If you would like to review the labels, call the run_label_builder function and specify doReview to be True.')
1216
1217    #Loops through label builder if there are empty labels or user selects to review labels
1218    if not labelingCompleted or review:
1219        rasterCols = [col for col in roidf.columns if col.startswith(RSTR_COL_IDENTIFIER)]
1220    
1221        if not review:
1222            rowsToSelect = roidf[ROIDF_LABEL_COL]=='' #Selects only rows with empty labels
1223        else:
1224            rowsToSelect = np.ones(len(roidf))==1 #Selects everything if we are reviewing assignments
1225
1226        # Iterates through selected rows to label
1227        for i,row in roidf[rowsToSelect].iterrows():
1228            rasterPaths = []
1229            for col in rasterCols:
1230                path=row[col]
1231                rasterPaths.append(path)
1232            if not review:
1233                selectedLabel=UNASSIGNED_LABELNAME
1234            else:
1235                selectedLabel = row[ROIDF_LABEL_COL]
1236            label = assign_label(rasterPaths, labels, plotkwargs, i, selectedLabel=selectedLabel) # Brings up pop up window for label assign
1237            print('Value returned: {}'.format(label))
1238            if label:
1239                roidf.loc[i,ROIDF_LABEL_COL] = label
1240            else:
1241                break #Stop loop if label assign action was exited by user.
1242
1243    return roidf, save
1244
1245def save_results(aoiLabel, roidf, outDirectory):
1246    """Save geodataframe as a shape file in output directory. 
1247
1248    Args:
1249        aoiLabel (str or int): A label to identify the region of interest.
1250        roidf (gpd.GeoDataFrame): A geodataframe with columns for polygons, paths to clipped rasters and images, and associated labels. 
1251        outDirectory (str): Filepath to the folder where the geodataframe and related components are stored.
1252
1253    Returns:
1254        saveStr (str): A string to tell the user the file path for the saved geodataframe. 
1255    """
1256    # Set save location
1257    outFilePath = os.path.join(outDirectory,f'{aoiLabel}{LABEL_FILE_PATTERN}{LABEL_FILE_EXT}')
1258
1259    # Create print statement to indicate save location
1260    saveStr = 'Labels saved to: {}.'.format(os.path.abspath(outFilePath), aoiLabel)
1261
1262    #Save
1263    roidf.to_file(outFilePath, truncation=False)
1264
1265    return saveStr
1266
1267
1268def run_label_builder(labels: list, aoiLabel:str, roiWidth:float, outDirectory:str, 
1269                      polygonSource:str = 'file', roiMethod:str = 'point', rasterMethod:str = 'file', 
1270                      plotkwargs:list= None, 
1271                      numRois:int=None, pathToPolygon:str=None, rasterPaths:list=None, 
1272                      overwrite: bool = False, doReview: bool = False, interactive: bool = False, makeImgs:bool = False):
1273    """Run the label builder to interactively assign labels to regions of interest (ROIs).
1274
1275    This function performs the following steps:
1276    1. If the output directory does not exist, it is created.
1277    2. If temporary label data exists for the given AOI label, it is loaded; otherwise, a new GeoDataFrame is created.
1278    3. If loading a temporary GeoDataFrame, the function proceeds to step 5. If not, it generates ROIs based on the specified method.
1279    4. The ROIs are labeled interactively using the `create_labels` function.
1280    5. The progress or results are saved depending on whether labels were appended to the GeoDataFrame or a review was made.
1281
1282    Args:
1283        labels (list): A list of label options for user selection.
1284        aoiLabel (str or int): User-assigned label for the area of interest.
1285        roiWidth (float): Desired width of each ROI polygon in meters.
1286        outDirectory (str): Filepath to the folder where the GeoDataFrame and related components are stored.
1287        polygonSource (str, optional): Source of the polygon for generating ROIs. Options: 'file' or 'dem'. Defaults to 'file'.
1288        roiMethod (str, optional): Method used to generate ROIs. Options: 'point' or 'grid'. Defaults to 'point'.
1289        rasterMethod (str, optional): Method for getting raster data. Options: 'file', 'demgetter', or 'wms'. Defaults to 'file'.
1290        plotkwargs (list, optional): List of dictionaries containing plotting parameters for each derivative. Defaults to None
1291        numRois (int, optional): Number of random subareas to generate. Required if roiMethod is 'point'. Defaults to None.
1292        pathToPolygon (str, optional): Filepath to the polygon. Required if polygonSource is 'file'. Defaults to None.
1293        rasterPaths (list, optional): List of filepaths to raster datasets. Required if rasterMethod is 'file'. Defaults to None.
1294        overwrite (bool, optional): When true, specifies to overwrite existing dataframe if a filepath associated with the aoiLabel is detected. Defaults to False. 
1295        doReview (bool, optional): When true, specifies that the user would like to review labels for a complete geodatabase. Defauls to False.
1296        interactive (bool, optional): When true, turns on interactive user questions that allow the user to set overwrite. Defaults to False.
1297        makeImgs (bool, opional): When true, creates a folder of pngs for each grid visualization. Defaults to False.
1298
1299    Raises:
1300        ValueError: If the polygon source or raster method is not correctly specified.
1301
1302    Returns:
1303        None: The function does not return any value directly, but it performs the necessary steps for interactive labeling.
1304    """
1305    # First, look for incompatible function arguments
1306    if rasterMethod not in ['file', 'wms', 'demgetter']:
1307        raise ValueError(f'Invalid raster method: {rasterMethod}')
1308    if polygonSource not in ['file', 'dem']:
1309        raise ValueError(f'Invalid polygon source: {polygonSource}')
1310    if roiMethod not in ['point','grid']:
1311        raise ValueError(f'Invalid roiMethod: {roiMethod}')
1312    
1313    # Create a folder in the out directory associated with aoiLabel
1314    outFolder = '{}'.format(aoiLabel)
1315    outDirectory = os.path.join(outDirectory, outFolder)
1316    if not os.path.exists(outDirectory):
1317            os.makedirs(outDirectory)
1318    
1319    # Define a save path for the label file
1320    filePath = os.path.join(outDirectory,f'{aoiLabel}{LABEL_FILE_PATTERN}{LABEL_FILE_EXT}')
1321
1322    # Extract spatial information
1323    spatialRef, epsg, polygon, crs = get_spatial_ref_and_polygon(rasterMethod, polygonSource, pathToPolygon, rasterPaths)
1324
1325    # Check if labeling is in progress/finished
1326    load = handle_progress(filePath, overwrite, interactive)
1327
1328    # If no plot keyword arguments are specified try to use a default based on raster method
1329    if plotkwargs is None:
1330        kwargdict = {
1331            'demgetter':DEM_GETTER_KWARGS,
1332            'wms':WMS_KWARGS
1333            }
1334        if rasterMethod == 'file':
1335            raise ValueError("For 'file' rasterMethod, plotkwargs must be specified manually.")
1336        elif rasterMethod in kwargdict:
1337            plotkwargs = kwargdict[rasterMethod]
1338        else:
1339            raise ValueError(f'Unsupported rasterMethod: {rasterMethod}')
1340    
1341    # Based on the presence of an existing file and user selections either load or create a roidf (Region of interest data frame)
1342    if load:
1343        #load label file
1344        roidf = load_df(filePath)
1345    else:
1346        if roiMethod == 'point' and numRois == None:
1347            raise ValueError('User must specify number of rois for point roi method.')
1348        # Create a new roidf
1349        roidf = make_roidf(rasterMethod, rasterPaths, spatialRef, polygon, crs, roiWidth, roiMethod, aoiLabel, outDirectory, plotkwargs, numRois=numRois, makeImgs = makeImgs)
1350    
1351    # Prompt user to assign labels for each area
1352    roidf, save = create_labels(roidf, labels, plotkwargs, doReview= doReview, interactive=interactive)
1353    print(roidf) #Print the df with new labels added after labeling is completed/exited
1354
1355    #Save progress or results depending on if labels were appended to roidf or review were made to a finished gdf
1356    if save:
1357        saveStr = save_results(aoiLabel, roidf, outDirectory)
1358        print(saveStr)
1359
1360    return None
1361
1362
1363
1364if __name__ == '__main__':
1365    import sys
1366    import json
1367    import llb_tools as llbt
1368    
1369    #Connect to the json file with your parameters
1370    params = sys.argv[1]
1371    with open(params, 'r') as f:
1372        params_dict = json.load(f)
1373    
1374    run_label_builder(**params_dict)
DEFAULT_FORMAT = 'image/tiff'
DEFAULT_EXT = '.tiff'
DEFAULT_CMAP = 'gray'
MERGED_RASTER_FOLDER = 'merged_rasters'
TEMP_RASTER_DOWNLOAD_FOLDER = 'temp_raster_files'
UNASSIGNED_LABELNAME = 'No Selection'
ROIDF_LABEL_COL = 'demLabels'
CLIPPED_ROI_FOLDER = 'clippedRois'
IMG_FOLDER = 'imgs'
RSTR_COL_IDENTIFIER = 'rstr_clp'
ROIDF_IMG_COL_PATTERN = 'imgPaths'
LABEL_FILE_PATTERN = '_labels'
LABEL_FILE_EXT = '.shp'
DG_DATASET = 'DEM_1m'
DERIVATIVE_NAMES_DG = ['hillshade', 'slope']
DEM_GETTER_KWARGS = [{'cmap': 'gray', 'alpha': 0}, {'cmap': 'gray'}, {'cmap': 'gray_r', 'alpha': 0.3}]
WMSPATH = 'https://elevation.nationalmap.gov/arcgis/services/3DEPElevation/ImageServer/WMSServer'
VERSION = '1.3.0'
RASTERLAYERS = ['3DEPElevation:None', '3DEPElevation:Hillshade Multidirectional', '3DEPElevation:Slope Degrees']
LAYERNAMES = ['dem', 'hs', 'slope']
TILESIZE = (200, 200)
WMS_KWARGS = [{'cmap': 'gray', 'alpha': 0}, {'cmap': 'gray'}, {'cmap': 'gray_r', 'alpha': 0.4}]
def get_raster_as_grid(raster):
 68def get_raster_as_grid(raster):
 69    """Loads in a digital elevation model as a numpy array and converts any no-data to np.nan
 70
 71        Args:
 72            raster (str OR gdal.Dataset): The path to a single band, gdal readable, digital elevation model OR
 73                an already loaded raster dataset
 74
 75        Returns:
 76            rasterGrid (numpy.ndarray): The elevation data stored as a grid
 77            dx (float): The x coordinate spacing of the grid
 78            dy (float): The y coordinate spacing of the grid
 79
 80        Raises:
 81            Exception: Input is neither a path to a raster or a gdal.Dataset
 82        """
 83    
 84    #If this is a raster dataset
 85    if isinstance(raster,gdal.Dataset):
 86        doClose = False
 87    
 88    #If this is a file
 89    elif os.path.isfile(raster):   
 90        #Get the raster grid
 91        raster = gdal.Open(raster)
 92        
 93
 94        #Close this file after the operation completes
 95        doClose = True
 96    else:
 97         Exception('Specified raster is neither a path to a raster or a gdal.Dataset. Please specify a valid path.')
 98
 99    rasterGrid = raster.ReadAsArray().astype(float)
100    NDV = raster.GetRasterBand(1).GetNoDataValue()
101
102    #Mask out NDVs as nan
103    rasterGrid[rasterGrid==NDV] = np.nan
104
105    # Grab the basic header information (xUL, dx, rot1, yUL, rot2, dy)
106    geotransform = raster.GetGeoTransform()  
107    
108    dx = geotransform[1]
109    dy = geotransform[-1]
110
111    if doClose:
112        raster = None #Close the raster
113
114    return rasterGrid, dx, dy

Loads in a digital elevation model as a numpy array and converts any no-data to np.nan

Arguments:
  • raster (str OR gdal.Dataset): The path to a single band, gdal readable, digital elevation model OR an already loaded raster dataset
Returns:

rasterGrid (numpy.ndarray): The elevation data stored as a grid dx (float): The x coordinate spacing of the grid dy (float): The y coordinate spacing of the grid

Raises:
  • Exception: Input is neither a path to a raster or a gdal.Dataset
def get_spatial_ref_from_raster(rasterFile):
116def get_spatial_ref_from_raster(rasterFile):
117    """Extracts spatial reference information and the epsg code as a string from an input raster path.
118
119    Args:
120        pathToRaster (str): The path to a single band, gdal readable, digital elevation model OR
121        an already loaded raster dataset
122
123    Returns:
124        spatialRef (osgeo.osr.SpatialReference): Spatial reference information
125        epsg (str): A string representing the EPSG (European Petroleum Survey Group) code associated with the
126        spatial reference, or None if the EPSG code is not available.
127    """
128    
129    if isinstance(rasterFile, str):
130        # If the input is a path, open the raster using gdal.Open()
131        raster = gdal.Open(rasterFile)
132    elif isinstance(rasterFile, gdal.Dataset):
133        # If the input is already a raster dataset, use it directly
134        raster = rasterFile
135    else:
136        raise ValueError("Invalid input type. Provide either a path to a raster or a gdal.Dataset.")
137
138    spatialRef = osr.SpatialReference(wkt=raster.GetProjection()) #get spatial info as string
139    epsg = spatialRef.GetAuthorityCode(None) #get epsg code as string
140    
141    raster = None #close raster
142
143    return spatialRef, epsg

Extracts spatial reference information and the epsg code as a string from an input raster path.

Arguments:
  • pathToRaster (str): The path to a single band, gdal readable, digital elevation model OR
  • an already loaded raster dataset
Returns:

spatialRef (osgeo.osr.SpatialReference): Spatial reference information epsg (str): A string representing the EPSG (European Petroleum Survey Group) code associated with the spatial reference, or None if the EPSG code is not available.

def get_spatial_ref_from_shapefile(shapefile):
145def get_spatial_ref_from_shapefile(shapefile):
146    """Retrieves the spatial reference and EPSG code from a shapefile or GeoDataFrame.
147
148    This function accepts either the path to a shapefile or a `GeoDataFrame`. It reads the shapefile (if a path is provided),
149    extracts the spatial reference as a WKT (Well-Known Text) string, and then converts it to an `osgeo.osr.SpatialReference` object.
150    The function also retrieves the EPSG code associated with the spatial reference.
151
152    Args:
153        shapefile (str or geopandas.GeoDataFrame): A string path to the shapefile or a `GeoDataFrame` object.
154
155    Raises:
156        ValueError: If the input is neither a valid file path to a shapefile nor a `GeoDataFrame`.
157
158    Returns:
159        tuple: A tuple containing:
160            - `osgeo.osr.SpatialReference`: The spatial reference object.
161            - `str`: The EPSG code as a string, or `None` if the EPSG code cannot be determined.
162    """
163    if isinstance(shapefile, str):
164        gdf = gpd.read_file(shapefile)
165    elif isinstance(shapefile, gpd.GeoDataFrame):
166        gdf = shapefile
167    else:
168        raise ValueError("Invalid input type. Provide either a path to a shapefile or a GeoDataFrame.")
169
170    spatialRefWkt = gdf.crs.to_wkt()
171    spatialRef = osr.SpatialReference(wkt=spatialRefWkt)
172
173    #Get epsg code
174    epsg = spatialRef.GetAuthorityCode(None)
175    
176
177    return spatialRef, epsg

Retrieves the spatial reference and EPSG code from a shapefile or GeoDataFrame.

This function accepts either the path to a shapefile or a GeoDataFrame. It reads the shapefile (if a path is provided), extracts the spatial reference as a WKT (Well-Known Text) string, and then converts it to an osgeo.osr.SpatialReference object. The function also retrieves the EPSG code associated with the spatial reference.

Arguments:
  • shapefile (str or geopandas.GeoDataFrame): A string path to the shapefile or a GeoDataFrame object.
Raises:
  • ValueError: If the input is neither a valid file path to a shapefile nor a GeoDataFrame.
Returns:

tuple: A tuple containing: - osgeo.osr.SpatialReference: The spatial reference object. - str: The EPSG code as a string, or None if the EPSG code cannot be determined.

def get_polygon_from_file(pathToPolygon: str, epsg: int = None):
179def get_polygon_from_file(pathToPolygon:str, epsg:int=None):
180    """Reads polygon geometries and CRS from a file and optionally reprojects the geometry to specified epsg.
181    
182    Args:
183        pathToPolygon (str): The path to the file containing the polygons.
184        epsg (int or None): The target EPSG code for reprojection. If None, no reprojection is done.
185    Returns:
186        polygon (shapely.geometry.Polygon): A shapely Polygon representing the area within which random ROIs will be generated.
187        crs (str): The coordinate reference system of the loaded polygon.
188    """
189    gdf = gpd.read_file(pathToPolygon)
190
191    if epsg is not None:
192        gdf = gdf.to_crs('EPSG:{}'.format(epsg))
193
194    polygon = gdf['geometry'].iloc[0]  
195    
196    crs = gdf.crs
197
198    return polygon,crs

Reads polygon geometries and CRS from a file and optionally reprojects the geometry to specified epsg.

Arguments:
  • pathToPolygon (str): The path to the file containing the polygons.
  • epsg (int or None): The target EPSG code for reprojection. If None, no reprojection is done.
Returns:

polygon (shapely.geometry.Polygon): A shapely Polygon representing the area within which random ROIs will be generated. crs (str): The coordinate reference system of the loaded polygon.

def get_polygon_from_raster(pathToDem: str, demEPSG):
200def get_polygon_from_raster(pathToDem:str, demEPSG):
201    """Constructs a polygon around the boundary of a digital elevation model (DEM) raster.
202
203    This function takes the path to a single-band, GDAL-readable digital elevation model 
204    and constructs a polygon that represents the boundary defined by the minimum and 
205    maximum x and y values of the raster.
206
207    Args:
208        pathToDem (str): The path to the DEM file.
209        demEPSG (int): The EPSG code for the coordinate reference system of the DEM.
210
211    Returns:
212        tuple: A tuple containing:
213            - polygon (Polygon): A Shapely Polygon object representing the boundary of the DEM.
214            - crs (CRS): The coordinate reference system of the polygon as a GeoPandas CRS object.
215
216    """
217    # Open the DEM file using GDAL
218    ds = gdal.Open(pathToDem)
219
220    # Retrieve the geotransform parameters
221    xmin, xpixel, _, ymax, _, ypixel = ds.GetGeoTransform()
222    # Retrieve the geotransform parameters
223    width, height = ds.RasterXSize, ds.RasterYSize 
224
225    # Calculate the maximum x and minimum y values based on the raster dimensions
226    xmax = xmin + width * xpixel
227    ymin = ymax + height * ypixel
228
229    # Create a polygon using the calculated boundary coordinates
230    polygon = Polygon([(xmin, ymax), (xmax, ymax), (xmax, ymin), (xmin, ymin), (xmin, ymax)])
231
232    # Create a GeoDataFrame to store the polygon with the specified CRS
233    gdf = gpd.GeoDataFrame(geometry=[polygon], crs='EPSG:{}'.format(demEPSG))
234
235    # Extract the polygon geometry from the GeoDataFrame
236    polygon = gdf['geometry'].iloc[0]
237    
238    # Return the polygon and its coordinate reference system
239    return polygon, gdf.crs

Constructs a polygon around the boundary of a digital elevation model (DEM) raster.

This function takes the path to a single-band, GDAL-readable digital elevation model and constructs a polygon that represents the boundary defined by the minimum and maximum x and y values of the raster.

Arguments:
  • pathToDem (str): The path to the DEM file.
  • demEPSG (int): The EPSG code for the coordinate reference system of the DEM.
Returns:

tuple: A tuple containing: - polygon (Polygon): A Shapely Polygon object representing the boundary of the DEM. - crs (CRS): The coordinate reference system of the polygon as a GeoPandas CRS object.

def get_spatial_ref_and_polygon( rasterMethod: str, polygonSource: str, pathToPolygon=None, rasterPaths=None):
241def get_spatial_ref_and_polygon(rasterMethod: str, polygonSource: str, pathToPolygon=None, rasterPaths=None):
242    """Retrieve spatial reference and polygon information based on specified raster and polygon sources.
243
244    This function determines the spatial reference and polygon geometry based on the provided 
245    raster and polygon sources. It supports two methods for raster retrieval ('file' or 'wms') 
246    and two sources for polygon information ('file' or 'dem').
247
248    Args:
249        rasterMethod (str): Method to retrieve raster information. Options:
250            - 'file': Use local raster files.
251            - 'wms': Use a Web Map Service for raster data.
252        polygonSource (str): Source of the polygon. Options:
253            - 'file': Load polygon from a file.
254            - 'dem': Generate polygon from a digital elevation model (DEM).
255        pathToPolygon (str, optional): Path to the polygon file (required if polygonSource is 'file'). Defaults to None.
256        rasterPaths (List[str], optional): List of paths to raster files (required if rasterMethod is 'file'). Defaults to None.
257
258    Raises:
259        ValueError: If raster paths are not provided when rasterMethod is 'file'.
260        ValueError: If a polygon path is not provided when polygonSource is 'file'.
261        ValueError: If an invalid polygon source is specified (must be 'file' or 'dem').
262        ValueError: If attempting to extract a polygon from a raster when raster method is 'wms'.
263        ValueError: If there are issues obtaining spatial reference or polygon information.
264
265    Returns:
266        Tuple[TypeSpatialRef, TypeEPSG, TypePolygon, TypeCRS]: A tuple containing:
267            - spatialRef: The spatial reference of the raster or polygon.
268            - epsg: The EPSG code associated with the spatial reference.
269            - polygon: The geometry of the polygon.
270            - crs: The coordinate reference system of the polygon.
271    """
272    
273    # Check the raster method and retrieve the spatial reference accordingly
274    if rasterMethod == 'file':
275        if rasterPaths is None:
276            raise ValueError('Must provide raster paths when rasterMethod is "file".')
277        # Get spatial reference and EPSG code from the first raster file
278        spatialRef, epsg = get_spatial_ref_from_raster(rasterPaths[0])
279    
280    else:  # rasterMethod is 'wms'
281        if pathToPolygon is None:
282            raise ValueError('Must provide a polygon path when rasterMethod is "wms".')
283        # Get spatial reference from the shapefile
284        spatialRef, epsg = get_spatial_ref_from_shapefile(pathToPolygon)
285
286    # Retrieve polygon based on the specified polygon source
287    if polygonSource == 'file':
288        if pathToPolygon is None:
289            raise ValueError('Must provide a polygon path when polygonSource is "file".')
290        # Get polygon geometry and CRS from the specified file
291        polygon, crs = get_polygon_from_file(pathToPolygon, epsg)
292    
293    elif polygonSource == 'dem':
294        if rasterMethod == 'wms':
295            raise ValueError('Cannot extract polygon from raster when raster method is "wms". '
296                             'Please change the value for polygonSource or rasterMethod.')
297        # Get polygon geometry from the raster
298        polygon, crs = get_polygon_from_raster(rasterPaths[0], epsg)
299    
300    else:
301        raise ValueError('Invalid polygon source specified. Valid options are "file" or "dem".')
302
303    return spatialRef, epsg, polygon, crs

Retrieve spatial reference and polygon information based on specified raster and polygon sources.

This function determines the spatial reference and polygon geometry based on the provided raster and polygon sources. It supports two methods for raster retrieval ('file' or 'wms') and two sources for polygon information ('file' or 'dem').

Arguments:
  • rasterMethod (str): Method to retrieve raster information. Options:
    • 'file': Use local raster files.
    • 'wms': Use a Web Map Service for raster data.
  • polygonSource (str): Source of the polygon. Options:
    • 'file': Load polygon from a file.
    • 'dem': Generate polygon from a digital elevation model (DEM).
  • pathToPolygon (str, optional): Path to the polygon file (required if polygonSource is 'file'). Defaults to None.
  • rasterPaths (List[str], optional): List of paths to raster files (required if rasterMethod is 'file'). Defaults to None.
Raises:
  • ValueError: If raster paths are not provided when rasterMethod is 'file'.
  • ValueError: If a polygon path is not provided when polygonSource is 'file'.
  • ValueError: If an invalid polygon source is specified (must be 'file' or 'dem').
  • ValueError: If attempting to extract a polygon from a raster when raster method is 'wms'.
  • ValueError: If there are issues obtaining spatial reference or polygon information.
Returns:

Tuple[TypeSpatialRef, TypeEPSG, TypePolygon, TypeCRS]: A tuple containing: - spatialRef: The spatial reference of the raster or polygon. - epsg: The EPSG code associated with the spatial reference. - polygon: The geometry of the polygon. - crs: The coordinate reference system of the polygon.

def handle_progress(filePath: str, overwrite, interactive):
334def handle_progress(filePath:str, overwrite, interactive):
335    """Checks for existing file, prompts user to select load or overwrite, and stores selection in load variable. 
336
337    Args:
338        filePath (str): Path to geodatabase where clipped rasters and labels are stored. 
339
340    Raises:
341        ValueError: If user first selects (n) to overwrite and then selects (n) to not proceed with overwrite this error is raised so user can rerun the tool and make the correct selections.
342
343    Returns:
344        load (bool): Returns true if user selects to load temporary data.
345    """
346    #Check progress to see if there is there is a temporary file associated with aoiLabel
347    inProgress = os.path.exists(filePath)
348
349    #Prompts user to make selections on how to proceed with temporary data.
350    if inProgress:
351        if interactive:
352            qStr = "File associated with aoi label detected.\nDo you want to continue labeling (y) or overwrite existing progress (n)"
353            load = _request_yn_input(question = qStr)
354            if not(load):
355                confirmStr = "You are about to overwrite progress. Proceed"
356                confirmOverwrite = _request_yn_input(confirmStr)
357                if not(confirmOverwrite):
358                    raise ValueError('User canceled overwrite of dataframe. Try again and correctly specify load or overwrite.')
359        elif overwrite:
360            load = False
361        else:
362            load = True
363    else:
364        load = False #returns False to prompt creation of new rois
365
366    return load

Checks for existing file, prompts user to select load or overwrite, and stores selection in load variable.

Arguments:
  • filePath (str): Path to geodatabase where clipped rasters and labels are stored.
Raises:
  • ValueError: If user first selects (n) to overwrite and then selects (n) to not proceed with overwrite this error is raised so user can rerun the tool and make the correct selections.
Returns:

load (bool): Returns true if user selects to load temporary data.

def rm_non_intersecting_rows(targetDf, intersectsPoly):
368def rm_non_intersecting_rows(targetDf, intersectsPoly):
369    """Remove rows from a DataFrame that do not intersect with a specified polygon.
370
371    This function iterates through each row of the input DataFrame and checks if the geometry 
372    of each row intersects with the provided polygon. Rows that do not intersect are marked for 
373    removal. The resulting DataFrame contains only the rows with geometries that intersect the polygon.
374
375    Args:
376        targetDf (GeoDataFrame): A GeoDataFrame containing geometries to be checked for intersection.
377        intersectsPoly (Polygon): A Shapely Polygon object used to check for intersections with the geometries.
378
379    Returns:
380        GeoDataFrame: A new GeoDataFrame containing only the rows that intersect with the specified polygon.
381    """
382    
383    # Initialize a list to keep track of indices of rows to drop
384    rows_to_drop = []
385    
386    # Iterate over each row in the DataFrame
387    for i, row in targetDf.iterrows():
388        # Check if the geometry of the current row intersects with the specified polygon
389        intersects = row['geometry'].intersects(intersectsPoly)
390        
391        # If there is no intersection, mark the row index for dropping
392        if not intersects:
393            rows_to_drop.append(i)
394    
395    # Drop the non-intersecting rows from the DataFrame
396    outDf = targetDf.drop(index=rows_to_drop)
397    
398    # Reset the index of the resulting DataFrame
399    outDf.reset_index(drop=True, inplace=True)
400    
401    return outDf

Remove rows from a DataFrame that do not intersect with a specified polygon.

This function iterates through each row of the input DataFrame and checks if the geometry of each row intersects with the provided polygon. Rows that do not intersect are marked for removal. The resulting DataFrame contains only the rows with geometries that intersect the polygon.

Arguments:
  • targetDf (GeoDataFrame): A GeoDataFrame containing geometries to be checked for intersection.
  • intersectsPoly (Polygon): A Shapely Polygon object used to check for intersections with the geometries.
Returns:

GeoDataFrame: A new GeoDataFrame containing only the rows that intersect with the specified polygon.

def create_rand_rois(polygon, crs: str, numRoi: int, roiWidth: float):
403def create_rand_rois(polygon, crs:str, numRoi:int, roiWidth:float):
404    """
405    Generates randomly located, non intersecting Region of Interest (ROI) polygons that are fully contained within the input polygon.
406
407    Args:
408        polygon (shapely.geometry.Polygon): A shapely Polygon representing the area within which random ROIs will be generated.
409        crs (str): The coordinate reference system (CRS) for the generated GeoDataFrame.
410        numRoi (int): The number of random ROIs to create.
411        roiWidth (float): Width of each ROI polygon in meters.
412
413    Returns:
414        roidf (gpd.GeoDataFrame): A GeoDataFrame containing the random ROIs with the specified crs. 
415    """
416    roiGeoms = []
417
418    # find xy range for polygon
419    xmin, ymin, xmax, ymax = polygon.bounds
420
421    tryCnt = 0 #Keep up with how many times you've tried an input
422    maxTries = int(polygon.area / roiWidth**2) #Calculate the maximum possible number of non-overlapping ROis and set value to maxTries
423    
424    while len(roiGeoms) < numRoi and tryCnt < maxTries:
425    
426        tryCnt+=1 #Increment the tryCnt
427
428        # get point within range
429        xi = np.random.rand(1)[0] * (xmax - xmin) + xmin
430        yi = np.random.rand(1)[0] * (ymax - ymin) + ymin
431
432        # create coordinate pairs for points
433        roiHalfWidth = roiWidth / 2.0
434        xL = xi - roiHalfWidth
435        xR = xi + roiHalfWidth
436        yB = yi - roiHalfWidth
437        yT = yi + roiHalfWidth
438
439        newRoi = Polygon([(xL, yT), (xR, yT), (xR, yB), (xL, yB), (xL, yT)])
440
441        inPolygon = polygon.contains(newRoi)
442        intersectsExisting = any(existingRoi.intersects(newRoi) for existingRoi in roiGeoms)
443
444
445        # Check if point is within polygon, and if the new ROI intersects with any existing ROIs
446        if inPolygon and not intersectsExisting:
447            roiGeoms.append(newRoi)
448            
449
450    #Warns user if they had almost gotten stuck in an infinite loop trying to find ROIs with the specified numROI.
451    if tryCnt >= maxTries:
452
453        warnings.simplefilter("always")
454        warnings.warn('Could not find requested number of ROIs after {} tries. Found a total of {} ROIs.'.format(tryCnt,len(roiGeoms)))
455    
456    roidf = gpd.GeoDataFrame(geometry=roiGeoms, crs=crs)
457
458    return roidf

Generates randomly located, non intersecting Region of Interest (ROI) polygons that are fully contained within the input polygon.

Arguments:
  • polygon (shapely.geometry.Polygon): A shapely Polygon representing the area within which random ROIs will be generated.
  • crs (str): The coordinate reference system (CRS) for the generated GeoDataFrame.
  • numRoi (int): The number of random ROIs to create.
  • roiWidth (float): Width of each ROI polygon in meters.
Returns:

roidf (gpd.GeoDataFrame): A GeoDataFrame containing the random ROIs with the specified crs.

def create_rand_grid(polygon, roiWidth: float, crs: str, numRois: int = None):
460def create_rand_grid(polygon, roiWidth:float, crs:str, numRois:int = None):
461    """Creates a square grid with grid length of roiWidth across the input polygon and assigns a random index
462    value to each grid cell for random labeling. 
463
464    Args:
465        polygon (shapely.geometry.Polygon): A shapely Polygon representing the area within which random ROIs will be generated.
466        roiWidth (float): Width of each ROI polygon in meters.
467        crs (str): The coordinate reference system (CRS) of the input polygon. Used to project resulting grid into same crs.
468        numRois (int, optional): Number of random subareas to generate.
469
470    Returns:
471        roidf (gpd.GeoDataFrame): A GeoDataFrame containing the random ROIs with the specified crs.
472    """
473    #Get the extent of the polygon
474    xmin, ymin, xmax, ymax = polygon.bounds
475
476    xcoords = np.arange(xmin, xmax, roiWidth) #create a list of the bottom left x coordinates for grids
477    ycoords = np.arange(ymin, ymax, roiWidth) #create a list of the bottom left y coordinates for grids
478
479    gridCells = [] #create an empty list to store constructed grid cells
480
481    for x in xcoords:
482        for y in ycoords:
483            gridCell = Polygon([(x,y),(x, y+ roiWidth),(x+roiWidth, y+roiWidth),(x + roiWidth, y),(x,y)]) #construct a grid polygon starting from bl corner
484            gridCells.append(gridCell) #add this grid cell to list
485   
486   
487    roidfUnfiltered = gpd.GeoDataFrame(geometry = gridCells, crs = crs) #add grid cells to roidf
488    
489    #Removes any polygons that are completely outside the bounding polygon. This is useful for irregular shaped input polygons.
490    roidf = rm_non_intersecting_rows(roidfUnfiltered, polygon)
491    n = len(roidf)
492
493    #If a subset of ROIS was requested
494    if numRois:
495        #Then subset based on numrois
496        if numRois>n:
497            numRois = n
498        randIndex = np.random.choice(n,numRois,replace=False)
499        roidf = roidf.iloc[randIndex].reset_index(drop=True)
500    else:
501        print("Number of grid cells: {}.".format(n))
502        
503    return roidf

Creates a square grid with grid length of roiWidth across the input polygon and assigns a random index value to each grid cell for random labeling.

Arguments:
  • polygon (shapely.geometry.Polygon): A shapely Polygon representing the area within which random ROIs will be generated.
  • roiWidth (float): Width of each ROI polygon in meters.
  • crs (str): The coordinate reference system (CRS) of the input polygon. Used to project resulting grid into same crs.
  • numRois (int, optional): Number of random subareas to generate.
Returns:

roidf (gpd.GeoDataFrame): A GeoDataFrame containing the random ROIs with the specified crs.

def clip_by_rois( rasterPaths, spatialRef, roidf: geopandas.geodataframe.GeoDataFrame, aoiLabel, outDirectory: str):
505def clip_by_rois(rasterPaths, spatialRef, roidf: gpd.GeoDataFrame, aoiLabel, outDirectory:str):
506    """"Clips an input raster to the extend of the Regions of Interest (ROIs) from 
507    a geopandas GeoDataFrame and appends the paths to the corresponding clipped raster
508    to the ROI GeoDataFrame. 
509
510    Args:
511        rasterPaths (str or list): List of filepaths to raster datasets or a string pointing to one raster.
512        spatialRef (osgeo.osr.SpatialReference): Desired spatial reference information for clipped rasters. 
513        Should be the same as input dem and polygons.
514        roidf (gpd.GeoDataFrame): A GeoDataFrame containing the ROIs for clipping.
515        aoiLabel (str or int): User assigned label for area of interest.
516        outDirectory (str): Filepath to the folder where the geodataframe and related components are stored.
517
518    Returns:
519        roidf (gpd.GeoDataFrame): A GeoDataFrame containing ROI polygons with an additional columns
520        containing file paths to the rasters.
521    """
522    
523    rois = roidf.geometry #access the geometry for rois from roidf
524    
525    try:
526        # Try to unpack the tuple
527        spatialRef, epsg = spatialRef
528    except (TypeError, ValueError):
529        # Handle the case where spatialRef is not a tuple
530        pass
531
532    if isinstance(rasterPaths, list):
533        for k, rasterPath in enumerate(rasterPaths):
534            clippedRasters = [] #Initialize a list to store clipped dems
535
536            #Looks for a clippedRois folder in raster directory and creates if does not exist
537            outFolder = os.path.join(outDirectory, CLIPPED_ROI_FOLDER)
538            if not os.path.exists(outFolder):
539                os.makedirs(outFolder)
540
541            # Extract WKT from the spatialRef tuple
542            wkt = spatialRef.ExportToWkt()
543            
544
545            for i,roi in enumerate(rois):
546                xmin, ymin, xmax, ymax = roi.bounds
547                outExtent = [xmin, ymin, xmax, ymax]
548                wrpOptions = gdal.WarpOptions(
549                    outputBounds=outExtent,
550                    srcSRS=wkt,
551                    dstSRS=wkt
552                )
553
554                fname = os.path.split(rasterPath)[1]
555                fname = os.path.splitext(fname)[0]  #gives filename and extension as two outputs
556                
557                outPath = os.path.join(outFolder, fname + '_{}'.format(aoiLabel) + '_r{}'.format(i)+DEFAULT_EXT)
558                gdal.Warp(outPath,rasterPath,options=wrpOptions)
559                clippedRasters.append(outPath)
560
561            #If a raster clip columna associated with this raster index exists, remove it:
562            if f'rstr_clp{k}' in roidf.columns:
563                roidf = roidf.drop(f'rstr_clp{k}', axis=1)
564
565            #make a new column and assign it the filepaths to the created clipped raster:
566            roidf[f'rstr_clp{k}'] = clippedRasters
567    elif isinstance(rasterPaths, str):
568        clippedRasters = []  # Initialize a list to store clipped DEMs
569
570        # Look for a 'clippedRois' folder in the raster directory and create it if it does not exist
571        outFolder = os.path.join(outDirectory, CLIPPED_ROI_FOLDER)
572        if not os.path.exists(outFolder):
573            os.makedirs(outFolder)
574
575        # Extract WKT from the spatialRef tuple
576        wkt = spatialRef.ExportToWkt()
577
578        for i, roi in enumerate(rois):
579            xmin, ymin, xmax, ymax = roi.bounds
580            outExtent = [xmin, ymin, xmax, ymax]
581            wrpOptions = gdal.WarpOptions(
582                outputBounds=outExtent,
583                srcSRS=wkt,
584                dstSRS=wkt
585            )
586
587            fname = os.path.split(rasterPaths)[1]
588            fname = os.path.splitext(fname)[0]  # gives filename and extension as two outputs
589            
590            outPath = os.path.join(outFolder, f"{fname}_{aoiLabel}_r{i}{DEFAULT_EXT}")
591            gdal.Warp(outPath, rasterPaths, options=wrpOptions)
592            clippedRasters.append(outPath)
593
594        #If a column exists named rstr_clp, remove it:
595        if 'rstr_clp0' in roidf.columns:
596            roidf = roidf.drop('rstr_clp0', axis=1)
597
598        #make a new column and assign it the filepaths to the created clipped raster
599        roidf['rstr_clp0'] = clippedRasters
600    return roidf

"Clips an input raster to the extend of the Regions of Interest (ROIs) from a geopandas GeoDataFrame and appends the paths to the corresponding clipped raster to the ROI GeoDataFrame.

Arguments:
  • rasterPaths (str or list): List of filepaths to raster datasets or a string pointing to one raster.
  • spatialRef (osgeo.osr.SpatialReference): Desired spatial reference information for clipped rasters.
  • Should be the same as input dem and polygons.
  • roidf (gpd.GeoDataFrame): A GeoDataFrame containing the ROIs for clipping.
  • aoiLabel (str or int): User assigned label for area of interest.
  • outDirectory (str): Filepath to the folder where the geodataframe and related components are stored.
Returns:

roidf (gpd.GeoDataFrame): A GeoDataFrame containing ROI polygons with an additional columns containing file paths to the rasters.

def batch_download_merge_dg(polygon, epsg, aoiLabel, outDirectory, deleteTempFiles=True):
602def batch_download_merge_dg(polygon, epsg, aoiLabel, outDirectory, deleteTempFiles= True):
603    """Batch download and merge digital elevation models (DEMs) based on a specified polygon with the dem_getter.
604
605    This function uses the dem_getter to retrieve the AWS paths for DEMs that intersect with the provided polygon, 
606    download them, and merge them into a single raster file. The output is saved in the specified 
607    directory with a filename based on the area of interest (AOI) label.
608
609    Args:
610        polygon (Polygon): A Shapely Polygon object defining the area of interest for downloading DEMs.
611        epsg (int): The EPSG code for the coordinate reference system to be used for the output raster.
612        aoiLabel (str): A label for the area of interest, used to name the output raster file.
613        outDirectory (str): The directory where the merged raster file will be saved.
614        deleteTempFiles (bool, optional): Flag indicating whether to delete temporary files after merging. Defaults to True.
615
616    Returns:
617        str or None: The file path of the merged raster if successful, or None if no paths were found for download.
618    """
619    try:
620        import dem_getter as dg
621    except ImportError:
622        print("Failed to import the 'dem_getter' module. Please ensure that you have set up your environment correctly.")
623        print("For setup instructions, refer to the Setting Up the Dem Getter section in README document.")
624
625    
626    # Extract x and y coordinates from the polygon's exterior
627    x,y = polygon.exterior.coords.xy
628
629    # Get the AWS paths for DEMs that intersect with the polygon
630    paths = dg.get_aws_paths_polygon(DG_DATASET, x, y, inputEPSG=epsg)
631
632    # Define the directory for saving the merged raster
633    saveDir = os.path.join(outDirectory, MERGED_RASTER_FOLDER)
634
635    # Create the directory if it does not exist
636    if not os.path.exists(saveDir):
637            os.makedirs(saveDir)
638
639    # Proceed if there are paths available for download
640    if paths:
641        print("Batch Downloading...")
642
643        # Define the temporary directory for downloaded rasters
644        tempSaveDir = os.path.join(outDirectory, TEMP_RASTER_DOWNLOAD_FOLDER) 
645        if not os.path.exists(tempSaveDir):
646            os.makedirs(tempSaveDir)
647
648        # Download the DEM files
649        filelist = dg.batch_download(paths, tempSaveDir, doForceDownload = True)
650
651        # Define the output filename and path for the merged raster
652        fname = f'{aoiLabel}.tif'
653        outPath = os.path.join(saveDir, fname)
654
655        # Define the extent for merging
656        mergeExtent = ([min(x), max(x)],[min(y),max(y)])
657
658        print("Merging DEMS...")
659
660        # Merge the downloaded DEMs into a single raster
661        dg.merge_warp_dems(filelist, outPath, mergeExtent, epsg)
662
663        # Optionally delete temporary files after merging
664        if deleteTempFiles:
665            if os.path.exists(tempSaveDir):
666                shutil.rmtree(tempSaveDir)
667    else:
668         outPath = None # No paths found for download
669
670    return outPath  # Return the path of the merged raster or None

Batch download and merge digital elevation models (DEMs) based on a specified polygon with the dem_getter.

This function uses the dem_getter to retrieve the AWS paths for DEMs that intersect with the provided polygon, download them, and merge them into a single raster file. The output is saved in the specified directory with a filename based on the area of interest (AOI) label.

Arguments:
  • polygon (Polygon): A Shapely Polygon object defining the area of interest for downloading DEMs.
  • epsg (int): The EPSG code for the coordinate reference system to be used for the output raster.
  • aoiLabel (str): A label for the area of interest, used to name the output raster file.
  • outDirectory (str): The directory where the merged raster file will be saved.
  • deleteTempFiles (bool, optional): Flag indicating whether to delete temporary files after merging. Defaults to True.
Returns:

str or None: The file path of the merged raster if successful, or None if no paths were found for download.

def clip_by_rois_demgetter( roidf: geopandas.geodataframe.GeoDataFrame, polygon, aoiLabel, outDirectory, spatialRef):
673def clip_by_rois_demgetter(roidf:gpd.GeoDataFrame, polygon, aoiLabel, outDirectory, spatialRef):
674    """Connects to the dem getter tool to download rasters and derivatives and crops these by roi polygons.
675
676    Args:
677        roidf (gpd.GeoDataFrame): GeoDataFrame containing ROIs with geometries.
678        polygon (shapely.geometry.Polygon): A shapely Polygon representing the area within which random ROIs will be generated.
679        aoiLabel (str): A label to identify the region of interest.
680        outDirectory (str): Filepath to the folder where the geodataframe and related components are stored.
681        spatialRef (osgeo.osr.SpatialReference): Desired spatial reference information for clipped rasters. 
682
683    Returns:
684        roidf (gpd.GeoDataFrame): A GeoDataFrame containing ROI polygons with an additional columns containing file 
685        paths to the rasters.
686    """
687
688    try:
689        import dem_getter as dg
690    except ImportError:
691        print("Failed to import the 'dem_getter' module. Please ensure that you have set up your environment correctly.")
692        print("For setup instructions, refer to the Setting Up the Dem Getter section in README document.")
693
694
695    # Extract the EPSG code
696    epsg = spatialRef.GetAuthorityCode(None)
697    
698    #Download and merge
699    demPath = batch_download_merge_dg(polygon, epsg, aoiLabel, outDirectory)
700
701    #Specify save location matching the DEM save directory
702    saveDir = os.path.join(outDirectory, MERGED_RASTER_FOLDER)
703
704    #Compute derivatives from the specified list of derivatives to compute
705    pathList = dg.compute_derivatives(demPath, DERIVATIVE_NAMES_DG, saveDir)
706
707    # Create a list of raster paths including derivatives and dems
708    rasterPathsDg = [demPath]
709    for derivPath in pathList:
710        rasterPathsDg.append(derivPath)
711
712    #Clip rasters to roi geometry
713    roidf = clip_by_rois(rasterPathsDg, spatialRef, roidf, aoiLabel, outDirectory)
714
715    return roidf

Connects to the dem getter tool to download rasters and derivatives and crops these by roi polygons.

Arguments:
  • roidf (gpd.GeoDataFrame): GeoDataFrame containing ROIs with geometries.
  • polygon (shapely.geometry.Polygon): A shapely Polygon representing the area within which random ROIs will be generated.
  • aoiLabel (str): A label to identify the region of interest.
  • outDirectory (str): Filepath to the folder where the geodataframe and related components are stored.
  • spatialRef (osgeo.osr.SpatialReference): Desired spatial reference information for clipped rasters.
Returns:

roidf (gpd.GeoDataFrame): A GeoDataFrame containing ROI polygons with an additional columns containing file paths to the rasters.

def clip_by_rois_wms( wmsPath: str, version: str, rasterLayers: list, layerNames: list, outDirectory: str, roidf: geopandas.geodataframe.GeoDataFrame, aoiLabel: str, tilesize: tuple):
717def clip_by_rois_wms(wmsPath:str,version:str,rasterLayers:list,layerNames:list,outDirectory:str, roidf: gpd.GeoDataFrame, aoiLabel:str, tilesize:tuple):
718    """This function connects to a WMS and clips specified raster layers based on the geometries of ROIs in a GeoDataFrame. The
719    clipped raster files are saved in a specified output directory.
720
721    Args:
722        wmsPath (str): The URL/path to the WMS service.
723        version (str): The version of the WMS service (e.g., '1.1.1', '1.3.0').
724        rasterLayers (list): List of raster layers to be clipped from the WMS.
725        layerNames (list): List of corresponding names for the raster layers.
726        outDirectory (str): Filepath to the folder where the geodataframe and related components are stored.
727        roidf (gpd.GeoDataFrame): GeoDataFrame containing ROIs with geometries.
728        aoiLabel (str): A label to identify the region of interest.
729        tilesize (tuple): Size of the tiles for WMS requests (width, height).
730
731    Returns:
732        roidf (gpd.GeoDataFrame): The input GeoDataFrame updated with columns containing paths to the clipped raster files.
733    """
734    # Connect to the wms
735    wms = WebMapService(wmsPath,version=version)
736    
737    # Access roi geometry
738    rois = roidf.geometry
739    
740    # Iterate over each raster layer and its corresponding name
741    for k, (rasterLayer,layerName) in enumerate(zip(rasterLayers,layerNames)):
742        clippedRasters = [] # Initialize a list to store paths of clipped rasters
743
744        # Define the output folder for clipped rasters
745        outFolder = os.path.join(outDirectory, CLIPPED_ROI_FOLDER)
746
747        # Create the output folder if it does not exist
748        if not os.path.exists(outFolder):
749            os.makedirs(outFolder)
750
751        # Iterate over each ROI to clip the raster layer
752        for i,roi in enumerate(rois):
753            # Get the bounding box of the current ROI
754            xmin, ymin, xmax, ymax = roi.bounds
755            
756            # Define the output path for the clipped raster
757            outPath = os.path.join(outFolder, layerName + '_{}'.format(aoiLabel) + '_r{}'.format(i)+DEFAULT_EXT)
758        
759            try:
760                # Request the map from the WMS service using the bounding box and specified parameters
761                img = wms.getmap(layers = [rasterLayer],
762                                format=DEFAULT_FORMAT,
763                                srs='EPSG:{}'.format(roidf.crs.to_epsg()),
764                                bbox=(xmin,ymin,xmax,ymax),
765                                size = tilesize)
766
767                # Write the retrieved image to the specified output path
768                with open(outPath,'wb') as out:
769                    out.write(img.read())
770                
771                # Append the output path to the list of clipped rasters
772                clippedRasters.append(outPath)
773            except Exception as e:
774                # Handle potential errors (e.g., timeouts) and log the issue
775                print(f'Error occurred while clipping raster: {e}')
776                clippedRasters.append('')
777        # Add a new column to the GeoDataFrame with the paths of the clipped rasters    
778        roidf['rstr_clp{}'.format(k)] = clippedRasters
779    
780    return roidf # Return the updated GeoDataFrame

This function connects to a WMS and clips specified raster layers based on the geometries of ROIs in a GeoDataFrame. The clipped raster files are saved in a specified output directory.

Arguments:
  • wmsPath (str): The URL/path to the WMS service.
  • version (str): The version of the WMS service (e.g., '1.1.1', '1.3.0').
  • rasterLayers (list): List of raster layers to be clipped from the WMS.
  • layerNames (list): List of corresponding names for the raster layers.
  • outDirectory (str): Filepath to the folder where the geodataframe and related components are stored.
  • roidf (gpd.GeoDataFrame): GeoDataFrame containing ROIs with geometries.
  • aoiLabel (str): A label to identify the region of interest.
  • tilesize (tuple): Size of the tiles for WMS requests (width, height).
Returns:

roidf (gpd.GeoDataFrame): The input GeoDataFrame updated with columns containing paths to the clipped raster files.

def min_max_by_percentile(raster, minPercentile=2, maxPercentile=98):
782def min_max_by_percentile(raster, minPercentile=2, maxPercentile=98):
783    """
784    Calculate the minimum and maximum values of a raster dataset based on specified percentiles.
785
786    Parameters:
787    - rasterPath (str): Path to the raster file.
788    - minPercentile (float): Minimum percentile for clipping.
789    - maxPercentile (float): Maximum percentile for clipping.
790
791    Returns:
792    - clipped_data (numpy.ndarray): Clipped raster data.
793    """
794    # Check if the input is a string (indicating a file path)
795    if isinstance(raster, str):
796        # Load the raster data as a numpy array and convert any no-data values to np.nan
797        rasterGrid = get_raster_as_grid(raster)[0]
798
799    # Check if the input is already a numpy array
800    elif isinstance(raster, np.ndarray):
801        rasterGrid = raster
802    else:
803        # Raise an error if the input type is invalid
804        raise ValueError("Invalid type for raster. Should be either a string (file path) or a numpy array.")
805
806    # Calculate the minimum value based on the specified minimum percentile
807    minVal = np.nanpercentile(rasterGrid, minPercentile)
808    
809    # Calculate the maximum value based on the specified maximum percentile
810    maxVal = np.nanpercentile(rasterGrid, maxPercentile)
811    
812    return minVal, maxVal  # Return the calculated minimum and maximum values

Calculate the minimum and maximum values of a raster dataset based on specified percentiles.

Parameters:

  • rasterPath (str): Path to the raster file.
  • minPercentile (float): Minimum percentile for clipping.
  • maxPercentile (float): Maximum percentile for clipping.

Returns:

  • clipped_data (numpy.ndarray): Clipped raster data.
def derivs_to_png( derivs: list, plotkwargs: list, outDirectory: str, aoiLabel, roinum: int, newSize=None):
814def derivs_to_png(derivs: list,plotkwargs:list, outDirectory:str, aoiLabel, roinum:int, newSize = None):
815    """This function takes a list of derivative data, each associated with its own set of plotting parameters, and creates
816    a composite PNG image. The image is saved to a specified output directory.
817
818    Args:
819        derivs (list): List of derivative data to be visualized.
820        plotkwargs (list): List of dictionaries containing plotting parameters for each derivative.
821        outDirectory (str): Filepath to the folder where the output PNG image will be stored.
822        aoiLabel (str or int): A label to identify the region of interest.
823        roinum (int): An identifier for the specific region of interest.
824        newSize (tuple, optional): Tuple specifying the new size of the output image (width, height). Defaults to None.
825
826    Returns:
827        outImagePath (str): Filepath to the saved PNG image.
828    """
829     # Create the output folder for images if it does not exist
830    outFolder = os.path.join(outDirectory, IMG_FOLDER)
831    if not os.path.exists(outFolder):
832            os.makedirs(outFolder)
833    # Define the output image path
834    outImagePath = os.path.join(outFolder, 'image{}_{}.png'.format(aoiLabel, roinum))
835    
836    # Initialize the output image as None
837    outImage = None
838
839    # Iterate over each derivative and its corresponding plotting parameters
840    for deriv, plotkwarg in zip(derivs,plotkwargs):
841
842        # Load the derivative data as a grid
843        derivGrid = get_raster_as_grid(deriv)[0]
844
845        # Check for required scaling arguments; if not present, use percentiles for scaling
846        if not('vmin' in plotkwarg):
847            plotkwarg['vmin'] = min_max_by_percentile(derivGrid)[0]
848        if not('vmax' in plotkwarg):
849            plotkwarg['vmax'] = min_max_by_percentile(derivGrid)[1]
850        
851         # Check for required colormap; if not present, use the default colormap
852        if not('cmap' in plotkwarg):
853            plotkwarg['cmap'] = DEFAULT_CMAP
854        
855        # Scale the derivative data between 0 and 1 based on min/max values
856        derivGrid = (derivGrid - plotkwarg['vmin'])/(plotkwarg['vmax'] - plotkwarg['vmin'])
857        # Saturate values to ensure they stay within the range [0, 1]
858        derivGrid[derivGrid<=0] = 0
859        derivGrid[derivGrid>=1] = 1
860
861        # Get the colormap function from matplotlib
862        colormapping = plt.get_cmap(plotkwarg['cmap'])
863        # Apply the colormap to the derivative grid
864        derivGrid = colormapping(derivGrid)
865
866        # Create a new image from the array and handle alpha transparency if specified
867        newImage = Image.fromarray((derivGrid * 255).astype(np.uint8))
868        if 'alpha' in plotkwarg:
869            newImage.putalpha(int((1-plotkwarg['alpha'])*255))
870        else:
871            # Ensure the image is in RGBA format for compositing
872            newImage = newImage.convert('RGBA')
873        
874        # Composite the new image onto the existing output image
875        if not(outImage):
876            outImage = newImage # If no image exists, set it as the output image
877        else:
878            outImage = Image.alpha_composite(outImage,newImage)  # Composite the new image
879
880    # Resize the output image if a new size is specified
881    if newSize:
882        outImage = outImage.resize(newSize,resample=3) # 3 for bicubic resampling
883
884    # Save the final composite image to the specified output path
885    outImage.save(outImagePath)
886    
887    return outImagePath # Return the path to the saved image

This function takes a list of derivative data, each associated with its own set of plotting parameters, and creates a composite PNG image. The image is saved to a specified output directory.

Arguments:
  • derivs (list): List of derivative data to be visualized.
  • plotkwargs (list): List of dictionaries containing plotting parameters for each derivative.
  • outDirectory (str): Filepath to the folder where the output PNG image will be stored.
  • aoiLabel (str or int): A label to identify the region of interest.
  • roinum (int): An identifier for the specific region of interest.
  • newSize (tuple, optional): Tuple specifying the new size of the output image (width, height). Defaults to None.
Returns:

outImagePath (str): Filepath to the saved PNG image.

def make_imgs( roidf: geopandas.geodataframe.GeoDataFrame, plotkwargs: list, outDirectory: str, aoiLabel):
889def make_imgs(roidf:gpd.GeoDataFrame, plotkwargs:list ,outDirectory:str, aoiLabel):
890    """Generate PNG images for each region of interest (ROI) based on specified plotting parameters.
891
892    This function takes a GeoDataFrame containing regions of interest and associated raster paths and generates PNG
893    images for each ROI using the `derivs_to_png` function. The image paths are added to the GeoDataFrame.
894
895    Args:
896        roidf (gpd.GeoDataFrame): GeoDataFrame containing ROIs and associated raster paths.
897        plotkwargs (list): List of dictionaries containing plotting parameters for each derivative.
898        outDirectory (str): Filepath to the folder where the output PNG images will be stored.
899        aoiLabel (str or int): A label to identify the region of interest.
900
901    Returns:
902        roidf (gpd.GeoDataFrame): The input GeoDataFrame updated with a column containing paths to the generated PNG images.
903    """
904    # Identify columns in the GeoDataFrame that contain raster paths
905    rasterCols = [col for col in roidf.columns if col.startswith(RSTR_COL_IDENTIFIER)]
906    imgPaths = []
907    
908   # Iterate over each row in the GeoDataFrame
909    for i, row in roidf.iterrows():
910        rasterPaths = [] # Initialize a list to store raster paths for the current ROI
911        for col in roidf[rasterCols]:
912            path=row[col] # Get the raster path from the current row
913            rasterPaths.append(path)  # Append the path to the list
914
915        # Generate the PNG image using the derivs_to_png function
916        imgPath = derivs_to_png(rasterPaths,plotkwargs, outDirectory, aoiLabel, i, newSize = None)
917        imgPaths.append(imgPath) # Append the generated image path to the list
918    
919    # Add the image paths to the GeoDataFrame in a new column
920    roidf[ROIDF_IMG_COL_PATTERN] = imgPaths
921
922    return roidf # Return the updated GeoDataFrame

Generate PNG images for each region of interest (ROI) based on specified plotting parameters.

This function takes a GeoDataFrame containing regions of interest and associated raster paths and generates PNG images for each ROI using the derivs_to_png function. The image paths are added to the GeoDataFrame.

Arguments:
  • roidf (gpd.GeoDataFrame): GeoDataFrame containing ROIs and associated raster paths.
  • plotkwargs (list): List of dictionaries containing plotting parameters for each derivative.
  • outDirectory (str): Filepath to the folder where the output PNG images will be stored.
  • aoiLabel (str or int): A label to identify the region of interest.
Returns:

roidf (gpd.GeoDataFrame): The input GeoDataFrame updated with a column containing paths to the generated PNG images.

def load_df(filePath):
924def load_df(filePath):
925    """Loads roidf from filepaths. 
926
927    Args:
928        filePath (str): Path to geodatabase where clipped rasters and labels are stored.
929
930    Returns:
931        roidf (gpd.GeoDataFrame): A GeoDataFrame loaded from the filepath containing ROI polygons and file paths to associated clipped DEMs.
932    """
933    # Load the GeoDataFrame from the specified file path
934    roidf = gpd.read_file(filePath, truncation=False)
935    roidf = roidf.fillna('') #Fill Nan values with empty strings
936
937    recoveryAttempted = False  # Flag to track whether recovery attempt has been made
938        
939    # Get raster and image column names
940    rasterCols = [col for col in roidf.columns if col.startswith(RSTR_COL_IDENTIFIER)]
941    imageCols = [col for col in roidf.columns if col == 'imgPaths']
942
943    # Iterate over each row in the GeoDataFrame
944    for index, row in roidf.iterrows():
945        # Check raster file paths
946        for col in rasterCols:
947            path = row[col]
948            if not os.path.exists(path): # If the path does not exist
949                if not recoveryAttempted:
950                    print(f'Unable to locate some filepaths. Attempting to recover filepaths within the specified directory.')
951                    recoveryAttempted = True
952                directory = os.path.split(filePath)[0] # Get the directory of the file path
953                fname = os.path.split(path)[1] # Get the filename
954                tryPath = os.path.join(directory, CLIPPED_ROI_FOLDER, fname) # Construct the recovery path
955                if os.path.exists(tryPath): # If the recovery path exists
956                    roidf.at[index, col] = tryPath # Update the GeoDataFrame with the recovery path
957                else:
958                    raise ValueError(f"Unable to locate necessary filepaths.\nAttempted Path 1: {path}\nAttempted Path 2: {tryPath}")
959
960        # Check image file paths
961        for col in imageCols:
962            path = row[col]
963            if not os.path.exists(path): # If the path does not exist
964                if not recoveryAttempted:
965                    print(f'Unable to locate some filepaths. Attempting to recover filepaths within the specified directory.')
966                    recoveryAttempted = True
967                directory = os.path.split(filePath)[0]  # Get the directory of the file path
968                fname = os.path.split(path)[1] # Get the filename
969                tryPath = os.path.join(directory, IMG_FOLDER, fname) # Construct the recovery path
970                if os.path.exists(tryPath):  # If the recovery path exists
971                    roidf.at[index, col] = tryPath  # Update the GeoDataFrame with the recovery path
972                else:
973                    raise ValueError(f"Unable to locate necessary filepaths.\nAttempted Path 1: {path}\nAttempted Path 2: {tryPath}")
974
975    return roidf # Return the loaded GeoDataFrame

Loads roidf from filepaths.

Arguments:
  • filePath (str): Path to geodatabase where clipped rasters and labels are stored.
Returns:

roidf (gpd.GeoDataFrame): A GeoDataFrame loaded from the filepath containing ROI polygons and file paths to associated clipped DEMs.

def make_roidf( rasterMethod, rasterPaths: list, spatialRef, polygon, crs, roiWidth: float, roiMethod, aoiLabel, outDirectory: str, plotkwargs: list, numRois: int = None, makeImgs: bool = False):
 977def make_roidf(rasterMethod, rasterPaths:list, spatialRef, polygon, crs, roiWidth:float, roiMethod, aoiLabel, outDirectory:str, plotkwargs:list, numRois:int=None, makeImgs:bool = False):
 978    """This function creates a GeoDataFrame with columns for ROI polygons, filepaths to associated clipped rasters,
 979    and images. The ROIs can be generated either randomly around points or in a grid within a specified polygon.
 980    Clipped rasters and images are obtained from either local files or a web map service (WMS).
 981
 982
 983    Args:
 984        rasterMethod (str): Method for getting rasters 'demgetter' if using the dem getter tool, 'file' if uploading raster files manually, or 'wms' if using a web map service. 
 985        rasterPath (list): List of filepaths to raster datasets.
 986        spatialRef (osgeo.osr.SpatialReference): Desired spatial reference information for clipped rasters.
 987        polygon (shapely.geometry.Polygon): A shapely Polygon representing the area within which random ROIs will be generated.
 988        crs (str): The coordinate reference system of the loaded polygon.
 989        roiWidth (float): Width of each ROI polygon in meters.
 990        roiMethod (str): Method used to generate random rois. Options: 'point' or 'grid'. If 'point' rois will be constructed 
 991        around randomly located points. If 'grid' input polygon will be broken into a roiWidth x roiWidth grid with each grid cell being an roi.
 992        aoiLabel (str or int): User assigned label for area of interest.
 993        outDirectory (str): Filepath to the folder where the geodataframe and related components are stored.
 994        plotkwargs (list): List of dictionaries containing plotting parameters for each derivative.
 995        numRois (int, optional): Number of random subareas to generate. Can only be None if roiMethod is grid. In this case will generate rois for entire gridded area.
 996
 997    Raises:
 998        ValueError: Raises error if rasterMethod is not correctly specified as 'file' or 'wms'. 
 999        ValueError: Raises error is roiMethod is not correctly specified as 'point' or 'grid'.
1000
1001    Returns:
1002        roidf (gpd.GeoDataFrame): A geopandas GeoDataFrame containing columns for roi polygons as well as filepaths to associated clipped rasters and images.
1003    """
1004
1005     # Define a mapping of ROI methods to their corresponding functions and arguments
1006    sourceFunc = {
1007    'point': (create_rand_rois,(polygon, crs, numRois, roiWidth)),
1008    'grid': (create_rand_grid, (polygon, roiWidth, crs, numRois))
1009    }
1010     # Check if the specified roiMethod is valid and call the corresponding function
1011    if roiMethod in sourceFunc:
1012        func,args = sourceFunc[roiMethod]
1013        roidf = func(*args) # Generate the ROIs
1014    else:
1015        raise ValueError('Roi method not correctly specified. Valid options: point or grid.')
1016    
1017    # Define a mapping of raster methods to their corresponding functions and arguments
1018    sourceFunc2 = {
1019    'demgetter': (clip_by_rois_demgetter,(roidf, polygon, aoiLabel, outDirectory, spatialRef)),
1020    'file': (clip_by_rois,(rasterPaths, spatialRef, roidf, aoiLabel, outDirectory)),
1021    'wms': (clip_by_rois_wms, (WMSPATH, VERSION, RASTERLAYERS, LAYERNAMES, outDirectory, roidf, aoiLabel, TILESIZE))
1022    }
1023    
1024    # Check if the specified rasterMethod is valid and call the corresponding function
1025    if rasterMethod in sourceFunc2:
1026        func,args = sourceFunc2[rasterMethod]
1027        roidf = func(*args) # Clip the data by rois
1028    else:
1029        raise ValueError('Raster method not correctly specified. Valid options: file or wms.')
1030
1031    # If requested, generate images for the ROIs
1032    if makeImgs:
1033        roidf = make_imgs(roidf, plotkwargs, outDirectory, aoiLabel)
1034
1035    return roidf # Return the created GeoDataFrame

This function creates a GeoDataFrame with columns for ROI polygons, filepaths to associated clipped rasters, and images. The ROIs can be generated either randomly around points or in a grid within a specified polygon. Clipped rasters and images are obtained from either local files or a web map service (WMS).

Arguments:
  • rasterMethod (str): Method for getting rasters 'demgetter' if using the dem getter tool, 'file' if uploading raster files manually, or 'wms' if using a web map service.
  • rasterPath (list): List of filepaths to raster datasets.
  • spatialRef (osgeo.osr.SpatialReference): Desired spatial reference information for clipped rasters.
  • polygon (shapely.geometry.Polygon): A shapely Polygon representing the area within which random ROIs will be generated.
  • crs (str): The coordinate reference system of the loaded polygon.
  • roiWidth (float): Width of each ROI polygon in meters.
  • roiMethod (str): Method used to generate random rois. Options: 'point' or 'grid'. If 'point' rois will be constructed
  • around randomly located points. If 'grid' input polygon will be broken into a roiWidth x roiWidth grid with each grid cell being an roi.
  • aoiLabel (str or int): User assigned label for area of interest.
  • outDirectory (str): Filepath to the folder where the geodataframe and related components are stored.
  • plotkwargs (list): List of dictionaries containing plotting parameters for each derivative.
  • numRois (int, optional): Number of random subareas to generate. Can only be None if roiMethod is grid. In this case will generate rois for entire gridded area.
Raises:
  • ValueError: Raises error if rasterMethod is not correctly specified as 'file' or 'wms'.
  • ValueError: Raises error is roiMethod is not correctly specified as 'point' or 'grid'.
Returns:

roidf (gpd.GeoDataFrame): A geopandas GeoDataFrame containing columns for roi polygons as well as filepaths to associated clipped rasters and images.

def assign_label( rasterPaths: list, labels: list, plotkwargs: list, i: int, selectedLabel: str = None):
1037def assign_label(rasterPaths:list, labels: list, plotkwargs:list, i:int, selectedLabel:str = None):
1038    """Assign a label to a raster image interactively based on user input.
1039
1040    Args:
1041        rasterPaths (list): A list of file paths to clipped rasters for visualization. 
1042        labels (list): A list of label options for user to choose between. 
1043        plotkwargs (list): List of dictionaries containing plotting parameters for each derivative.
1044        i (int): index number for label
1045        selectedLabel (str, optional): Default label to be pre-selected.. Defaults to None.
1046
1047    Returns:
1048        assignedLabel (str): The label assigned to the raster image based on user input.
1049        goBack (bool): A boolean indicating if the user chose to go back.
1050    """
1051    # Create an empty list to store labels
1052    assignedLabel = []
1053    
1054    #Generate a flag for manual close, automatically set to true
1055    userClose = True
1056
1057    # Define button options including a global variable for an unassigned tile which the labeling window
1058    # starts on by default
1059    buttonOptions = [UNASSIGNED_LABELNAME] + labels
1060  
1061    # Add images to the plot
1062    try:
1063        derivs = [get_raster_as_grid(path)[0] for path in rasterPaths]
1064    except Exception as e:
1065        assignedLabel = "No Data"
1066        print(f'Error getting raster as grid: {e}. Label #{i} Automatically Assigned: {assignedLabel}')
1067        return assignedLabel
1068    
1069    # Initialize plotting environment
1070    fig,axs = plt.subplots(figsize=(20,18))
1071    fig.subplots_adjust(right = 0.65)
1072    
1073    # Loop through derivative and corresponding parameters
1074    for deriv, kwarg in zip(derivs, plotkwargs):
1075        # Determine vmin and vmax for plotting display
1076        if 'vmin' in kwarg:
1077            minVal = kwarg.pop('vmin')
1078        else:
1079            minVal= min_max_by_percentile(deriv)[0]
1080        if 'vmax' in kwarg:
1081            maxVal = kwarg.pop('vmax')
1082        else:
1083            maxVal = min_max_by_percentile(deriv)[1]
1084
1085        # Add rasters to plot
1086        axs.imshow(deriv, vmin = minVal, vmax =maxVal, **kwarg)
1087    
1088    # Set axis labels
1089    axs.set_xlabel("E-W Distance (pixels)")
1090    axs.set_ylabel("N-S Distance (pixels)")
1091
1092    # Define event handlers for user interactions
1093    def handle_radio(label):
1094        """Handles radio button selection."""
1095        if assignedLabel:
1096            assignedLabel.pop() # Remove previous label if exists
1097        assignedLabel.append(label) # Add new label
1098
1099    def submit_and_close(event):
1100        """Handles submission of the selected label."""
1101        nonlocal userClose
1102        if radioButtons.value_selected == UNASSIGNED_LABELNAME:
1103            print('Please Make Valid Selection') # Prompts for valid selection
1104        else:
1105            print(f"Label #{i} Assigned: {radioButtons.value_selected}")
1106            userClose = False # Close the plot
1107            plt.close(fig)
1108
1109    def quit(event):
1110        """Handles quitting the labeling process."""
1111        nonlocal userClose
1112        if assignedLabel:
1113            assignedLabel.pop() # Removes previous label if exists
1114        print("Quit Label Assign")
1115        userClose = False
1116
1117        plt.close(fig)
1118
1119    def on_close(event):
1120        """Handles the event when the plot is closed."""
1121        nonlocal userClose
1122        if userClose:
1123            assignedLabel.pop() # Removes label if user closed the plot
1124            print("Quit Label Assign")
1125            plt.close(fig)
1126
1127    def on_key(event):
1128        """Handles key press events for label selection."""
1129        if event.key == 'enter':
1130            submit_and_close(event) # Submit on enter key
1131        if event.key.isdigit() and 1<= int(event.key) <=len(labels):
1132            labelIndex = int(event.key) - 1 # Get label index from key
1133            label = labels[labelIndex]
1134            radioButtons.set_active(buttonOptions.index(label)) # Set radio button active
1135             
1136    # Display information about key-label mapping
1137    key_mapping = {str(idx + 1): label for idx, label in enumerate(labels)}
1138    key_info = "\n".join([f"[{key}] : {label}" for key, label in key_mapping.items()])
1139
1140    # Display information about key-label mapping
1141    text_properties = {'va': 'center', 'ha': 'left', 
1142                       'bbox': dict(facecolor='white', alpha=0.8, boxstyle='round,pad=0.3')
1143                       }
1144
1145    # Set properties for displaying key information
1146    fig.text(0.67, 0.8, "Keyboard Shortcuts:\n[Enter] : Submit\n{}".format(key_info), transform=plt.gcf().transFigure, **text_properties)
1147    
1148    # Add radio buttons to the plot    
1149    radioAx = fig.add_axes([0.65, 0.4, 0.2, 0.2], frameon=False)
1150    radioButtons = RadioButtons(radioAx,buttonOptions,active=0)
1151    radioButtons.on_clicked(handle_radio)
1152
1153    #If the selectedLabel is present, set it as the default
1154    if selectedLabel:
1155        matchIdx = buttonOptions.index(selectedLabel) # Find index of selected label
1156        radioButtons.set_active(matchIdx)  # Set the radio button to active
1157
1158    #Add submit button functionality
1159    submitButtonAx = fig.add_axes([0.67, 0.25, 0.2, 0.06]) # Define position for submit button
1160    submitButton = Button(submitButtonAx, 'Submit',color='0.8', hovercolor='0.95') # Create submit button
1161    submitButton.on_clicked(submit_and_close) # Link radio button click to submit function
1162
1163    #Add a Quit button
1164    quitButtonAx = fig.add_axes([0.67, 0.175, 0.2, 0.06])
1165    quitButton = Button(quitButtonAx,'Quit',color='0.8', hovercolor='0.95')  # Create quit button
1166    quitButton.on_clicked(quit)  # Link button click to quit function
1167
1168    # Connect key press and close events to their respective handlers
1169    fig.canvas.mpl_connect('key_press_event', on_key) # Handle key presses
1170    fig.canvas.mpl_connect('close_event', on_close)  # Handle plot close event
1171
1172    plt.show(block=True)  # Show the plot and block execution until closed
1173
1174    #Assign outputs
1175    if assignedLabel:
1176        assignedLabel = assignedLabel.pop() # Get the last assigned label if exists
1177
1178    return assignedLabel # Return the assigned label

Assign a label to a raster image interactively based on user input.

Arguments:
  • rasterPaths (list): A list of file paths to clipped rasters for visualization.
  • labels (list): A list of label options for user to choose between.
  • plotkwargs (list): List of dictionaries containing plotting parameters for each derivative.
  • i (int): index number for label
  • selectedLabel (str, optional): Default label to be pre-selected.. Defaults to None.
Returns:

assignedLabel (str): The label assigned to the raster image based on user input. goBack (bool): A boolean indicating if the user chose to go back.

def create_labels( roidf, labels: list, plotkwargs: list, doReview: bool = False, interactive: bool = False):
1180def create_labels(roidf, labels: list, plotkwargs:list, doReview: bool= False, interactive: bool=False):
1181    """Create labels for regions of interest (ROIs) in a GeoDataFrame based on user input. If the input geodataframe is already
1182    populated, gives user option to review labels. 
1183
1184    This function iterates through the clipped DEMs (Digital Elevation Models) in the input GeoDataFrame
1185    and assigns labels to each ROI interactively using the `assign_label` function. The assigned labels
1186    are added to the GeoDataFrame as a ROIDF_LABEL_COL column.
1187
1188    Args:
1189        roidf (gpd.GeoDataFrame): A GeoDataFrame containing ROIs and associated clipped DEMs.
1190        labels (list): A list of label options for user to choose between. 
1191        plotkwargs (list): List of dictionaries containing plotting parameters for each derivative.
1192        doReview (bool, optional): When true, specifies that the user would like to review labels for a complete geodatabase. Defauls to False.
1193        interactive (bool, optional): When true, turns on interactive user questions that allow the user to set overwrite. Defaults to False. 
1194
1195    Returns:
1196        roidf (gpd.GeoDataFrame): The input GeoDataFrame updated with a ROIDF_LABEL_COL column containing labels for each ROI.
1197    """
1198    # Create a label column if it does not already exist
1199    if not (ROIDF_LABEL_COL in roidf.columns):
1200        roidf[ROIDF_LABEL_COL]=['' for i in range(len(roidf))]
1201
1202    # Checks if labeling had already been completed for this gdf. 
1203    labelingCompleted = all(roidf[ROIDF_LABEL_COL] != '')
1204    review = False
1205    save = True
1206
1207    #If labeling has already been completed i.e. no empty labels in the label column, asks the user if they would like to review labels.
1208    if labelingCompleted:
1209        if interactive:
1210            qStr = 'Labeling already completed for this area. Review labels (y/n)'
1211            review = _request_yn_input(question=qStr)
1212        else:
1213            review = doReview
1214        if not review:
1215            save = False
1216            print('Labeling Completed for the file associated with this aoiLabel. If you would like to review the labels, call the run_label_builder function and specify doReview to be True.')
1217
1218    #Loops through label builder if there are empty labels or user selects to review labels
1219    if not labelingCompleted or review:
1220        rasterCols = [col for col in roidf.columns if col.startswith(RSTR_COL_IDENTIFIER)]
1221    
1222        if not review:
1223            rowsToSelect = roidf[ROIDF_LABEL_COL]=='' #Selects only rows with empty labels
1224        else:
1225            rowsToSelect = np.ones(len(roidf))==1 #Selects everything if we are reviewing assignments
1226
1227        # Iterates through selected rows to label
1228        for i,row in roidf[rowsToSelect].iterrows():
1229            rasterPaths = []
1230            for col in rasterCols:
1231                path=row[col]
1232                rasterPaths.append(path)
1233            if not review:
1234                selectedLabel=UNASSIGNED_LABELNAME
1235            else:
1236                selectedLabel = row[ROIDF_LABEL_COL]
1237            label = assign_label(rasterPaths, labels, plotkwargs, i, selectedLabel=selectedLabel) # Brings up pop up window for label assign
1238            print('Value returned: {}'.format(label))
1239            if label:
1240                roidf.loc[i,ROIDF_LABEL_COL] = label
1241            else:
1242                break #Stop loop if label assign action was exited by user.
1243
1244    return roidf, save

Create labels for regions of interest (ROIs) in a GeoDataFrame based on user input. If the input geodataframe is already populated, gives user option to review labels.

This function iterates through the clipped DEMs (Digital Elevation Models) in the input GeoDataFrame and assigns labels to each ROI interactively using the assign_label function. The assigned labels are added to the GeoDataFrame as a ROIDF_LABEL_COL column.

Arguments:
  • roidf (gpd.GeoDataFrame): A GeoDataFrame containing ROIs and associated clipped DEMs.
  • labels (list): A list of label options for user to choose between.
  • plotkwargs (list): List of dictionaries containing plotting parameters for each derivative.
  • doReview (bool, optional): When true, specifies that the user would like to review labels for a complete geodatabase. Defauls to False.
  • interactive (bool, optional): When true, turns on interactive user questions that allow the user to set overwrite. Defaults to False.
Returns:

roidf (gpd.GeoDataFrame): The input GeoDataFrame updated with a ROIDF_LABEL_COL column containing labels for each ROI.

def save_results(aoiLabel, roidf, outDirectory):
1246def save_results(aoiLabel, roidf, outDirectory):
1247    """Save geodataframe as a shape file in output directory. 
1248
1249    Args:
1250        aoiLabel (str or int): A label to identify the region of interest.
1251        roidf (gpd.GeoDataFrame): A geodataframe with columns for polygons, paths to clipped rasters and images, and associated labels. 
1252        outDirectory (str): Filepath to the folder where the geodataframe and related components are stored.
1253
1254    Returns:
1255        saveStr (str): A string to tell the user the file path for the saved geodataframe. 
1256    """
1257    # Set save location
1258    outFilePath = os.path.join(outDirectory,f'{aoiLabel}{LABEL_FILE_PATTERN}{LABEL_FILE_EXT}')
1259
1260    # Create print statement to indicate save location
1261    saveStr = 'Labels saved to: {}.'.format(os.path.abspath(outFilePath), aoiLabel)
1262
1263    #Save
1264    roidf.to_file(outFilePath, truncation=False)
1265
1266    return saveStr

Save geodataframe as a shape file in output directory.

Arguments:
  • aoiLabel (str or int): A label to identify the region of interest.
  • roidf (gpd.GeoDataFrame): A geodataframe with columns for polygons, paths to clipped rasters and images, and associated labels.
  • outDirectory (str): Filepath to the folder where the geodataframe and related components are stored.
Returns:

saveStr (str): A string to tell the user the file path for the saved geodataframe.

def run_label_builder( labels: list, aoiLabel: str, roiWidth: float, outDirectory: str, polygonSource: str = 'file', roiMethod: str = 'point', rasterMethod: str = 'file', plotkwargs: list = None, numRois: int = None, pathToPolygon: str = None, rasterPaths: list = None, overwrite: bool = False, doReview: bool = False, interactive: bool = False, makeImgs: bool = False):
1269def run_label_builder(labels: list, aoiLabel:str, roiWidth:float, outDirectory:str, 
1270                      polygonSource:str = 'file', roiMethod:str = 'point', rasterMethod:str = 'file', 
1271                      plotkwargs:list= None, 
1272                      numRois:int=None, pathToPolygon:str=None, rasterPaths:list=None, 
1273                      overwrite: bool = False, doReview: bool = False, interactive: bool = False, makeImgs:bool = False):
1274    """Run the label builder to interactively assign labels to regions of interest (ROIs).
1275
1276    This function performs the following steps:
1277    1. If the output directory does not exist, it is created.
1278    2. If temporary label data exists for the given AOI label, it is loaded; otherwise, a new GeoDataFrame is created.
1279    3. If loading a temporary GeoDataFrame, the function proceeds to step 5. If not, it generates ROIs based on the specified method.
1280    4. The ROIs are labeled interactively using the `create_labels` function.
1281    5. The progress or results are saved depending on whether labels were appended to the GeoDataFrame or a review was made.
1282
1283    Args:
1284        labels (list): A list of label options for user selection.
1285        aoiLabel (str or int): User-assigned label for the area of interest.
1286        roiWidth (float): Desired width of each ROI polygon in meters.
1287        outDirectory (str): Filepath to the folder where the GeoDataFrame and related components are stored.
1288        polygonSource (str, optional): Source of the polygon for generating ROIs. Options: 'file' or 'dem'. Defaults to 'file'.
1289        roiMethod (str, optional): Method used to generate ROIs. Options: 'point' or 'grid'. Defaults to 'point'.
1290        rasterMethod (str, optional): Method for getting raster data. Options: 'file', 'demgetter', or 'wms'. Defaults to 'file'.
1291        plotkwargs (list, optional): List of dictionaries containing plotting parameters for each derivative. Defaults to None
1292        numRois (int, optional): Number of random subareas to generate. Required if roiMethod is 'point'. Defaults to None.
1293        pathToPolygon (str, optional): Filepath to the polygon. Required if polygonSource is 'file'. Defaults to None.
1294        rasterPaths (list, optional): List of filepaths to raster datasets. Required if rasterMethod is 'file'. Defaults to None.
1295        overwrite (bool, optional): When true, specifies to overwrite existing dataframe if a filepath associated with the aoiLabel is detected. Defaults to False. 
1296        doReview (bool, optional): When true, specifies that the user would like to review labels for a complete geodatabase. Defauls to False.
1297        interactive (bool, optional): When true, turns on interactive user questions that allow the user to set overwrite. Defaults to False.
1298        makeImgs (bool, opional): When true, creates a folder of pngs for each grid visualization. Defaults to False.
1299
1300    Raises:
1301        ValueError: If the polygon source or raster method is not correctly specified.
1302
1303    Returns:
1304        None: The function does not return any value directly, but it performs the necessary steps for interactive labeling.
1305    """
1306    # First, look for incompatible function arguments
1307    if rasterMethod not in ['file', 'wms', 'demgetter']:
1308        raise ValueError(f'Invalid raster method: {rasterMethod}')
1309    if polygonSource not in ['file', 'dem']:
1310        raise ValueError(f'Invalid polygon source: {polygonSource}')
1311    if roiMethod not in ['point','grid']:
1312        raise ValueError(f'Invalid roiMethod: {roiMethod}')
1313    
1314    # Create a folder in the out directory associated with aoiLabel
1315    outFolder = '{}'.format(aoiLabel)
1316    outDirectory = os.path.join(outDirectory, outFolder)
1317    if not os.path.exists(outDirectory):
1318            os.makedirs(outDirectory)
1319    
1320    # Define a save path for the label file
1321    filePath = os.path.join(outDirectory,f'{aoiLabel}{LABEL_FILE_PATTERN}{LABEL_FILE_EXT}')
1322
1323    # Extract spatial information
1324    spatialRef, epsg, polygon, crs = get_spatial_ref_and_polygon(rasterMethod, polygonSource, pathToPolygon, rasterPaths)
1325
1326    # Check if labeling is in progress/finished
1327    load = handle_progress(filePath, overwrite, interactive)
1328
1329    # If no plot keyword arguments are specified try to use a default based on raster method
1330    if plotkwargs is None:
1331        kwargdict = {
1332            'demgetter':DEM_GETTER_KWARGS,
1333            'wms':WMS_KWARGS
1334            }
1335        if rasterMethod == 'file':
1336            raise ValueError("For 'file' rasterMethod, plotkwargs must be specified manually.")
1337        elif rasterMethod in kwargdict:
1338            plotkwargs = kwargdict[rasterMethod]
1339        else:
1340            raise ValueError(f'Unsupported rasterMethod: {rasterMethod}')
1341    
1342    # Based on the presence of an existing file and user selections either load or create a roidf (Region of interest data frame)
1343    if load:
1344        #load label file
1345        roidf = load_df(filePath)
1346    else:
1347        if roiMethod == 'point' and numRois == None:
1348            raise ValueError('User must specify number of rois for point roi method.')
1349        # Create a new roidf
1350        roidf = make_roidf(rasterMethod, rasterPaths, spatialRef, polygon, crs, roiWidth, roiMethod, aoiLabel, outDirectory, plotkwargs, numRois=numRois, makeImgs = makeImgs)
1351    
1352    # Prompt user to assign labels for each area
1353    roidf, save = create_labels(roidf, labels, plotkwargs, doReview= doReview, interactive=interactive)
1354    print(roidf) #Print the df with new labels added after labeling is completed/exited
1355
1356    #Save progress or results depending on if labels were appended to roidf or review were made to a finished gdf
1357    if save:
1358        saveStr = save_results(aoiLabel, roidf, outDirectory)
1359        print(saveStr)
1360
1361    return None

Run the label builder to interactively assign labels to regions of interest (ROIs).

This function performs the following steps:

  1. If the output directory does not exist, it is created.
  2. If temporary label data exists for the given AOI label, it is loaded; otherwise, a new GeoDataFrame is created.
  3. If loading a temporary GeoDataFrame, the function proceeds to step 5. If not, it generates ROIs based on the specified method.
  4. The ROIs are labeled interactively using the create_labels function.
  5. The progress or results are saved depending on whether labels were appended to the GeoDataFrame or a review was made.
Arguments:
  • labels (list): A list of label options for user selection.
  • aoiLabel (str or int): User-assigned label for the area of interest.
  • roiWidth (float): Desired width of each ROI polygon in meters.
  • outDirectory (str): Filepath to the folder where the GeoDataFrame and related components are stored.
  • polygonSource (str, optional): Source of the polygon for generating ROIs. Options: 'file' or 'dem'. Defaults to 'file'.
  • roiMethod (str, optional): Method used to generate ROIs. Options: 'point' or 'grid'. Defaults to 'point'.
  • rasterMethod (str, optional): Method for getting raster data. Options: 'file', 'demgetter', or 'wms'. Defaults to 'file'.
  • plotkwargs (list, optional): List of dictionaries containing plotting parameters for each derivative. Defaults to None
  • numRois (int, optional): Number of random subareas to generate. Required if roiMethod is 'point'. Defaults to None.
  • pathToPolygon (str, optional): Filepath to the polygon. Required if polygonSource is 'file'. Defaults to None.
  • rasterPaths (list, optional): List of filepaths to raster datasets. Required if rasterMethod is 'file'. Defaults to None.
  • overwrite (bool, optional): When true, specifies to overwrite existing dataframe if a filepath associated with the aoiLabel is detected. Defaults to False.
  • doReview (bool, optional): When true, specifies that the user would like to review labels for a complete geodatabase. Defauls to False.
  • interactive (bool, optional): When true, turns on interactive user questions that allow the user to set overwrite. Defaults to False.
  • makeImgs (bool, opional): When true, creates a folder of pngs for each grid visualization. Defaults to False.
Raises:
  • ValueError: If the polygon source or raster method is not correctly specified.
Returns:

None: The function does not return any value directly, but it performs the necessary steps for interactive labeling.