edges_cal.xrfi.xrfi_medfilt

edges_cal.xrfi.xrfi_medfilt(spectrum: ndarray, threshold: float = 6, flags: ndarray | None = None, kf: int = 8, kt: int = 8, inplace: bool = True, max_iter: int = 1, poly_order: int = 0, accumulate: bool = False, use_meanfilt: bool = True) → tuple[numpy.ndarray, dict[str, Any]]

Generate RFI flags for a given spectrum using a median filter.

Parameters:

spectrum (array-like) – Either a 1D array of shape (NFREQS,) or a 2D array of shape (NTIMES, NFREQS) defining the measured raw spectrum. If 2D, a 2D filter in freq*time will be applied by default. One can perform the filter just over frequency (in the case that NTIMES > 1) by setting kt=0.
threshold (float, optional) – Number of effective sigma at which to clip RFI.
flags (array-like, optional) – Boolean array of pre-existing flagged data to ignore in the filtering.
kt, kf (int or None, optional) – The half-size of the kernel over time and frequency, respectively (e.g. the kernel size over frequency will be 2*kf+1). A value of zero (for any dimension) omits that axis from the kernel, effectively applying the detrending for each subarray along that axis. A value of None will effectively (but slowly) perform a median along the entire axis before running the kernel over the other axis.
inplace (bool, optional) – If True, and flags are given, update the flags in-place instead of creating a new array.
max_iter (int, optional) – Maximum number of iterations to perform. Each iteration uses the flags of the previous iteration to achieve a more robust estimate of the flags. Multiple iterations are more useful if poly_order > 0.
poly_order (int, optional) – If greater than 0, fits a polynomial to the spectrum before performing the median filter. Only allowed if spectrum is 1D. This is useful for getting the number of false positives down. If max_iter>1, the polynomial will be refit on each iteration (using new flags).
accumulate (bool, optional) – If True, on each iteration, accumulate flags. Otherwise, use only flags from the previous iteration and then forget about them. Recommended to be False.
use_meanfilt (bool, optional) – Whether to apply a mean filter after the median filter. The median filter is good at getting RFI, but can also pick up non-RFI if the spectrum is steep compared to the noise. The mean filter is better at only getting RFI if the RFI has already been flagged.

Returns:

flags (array-like) – Boolean array of the same shape as spectrum indicating which channels/times have flagged RFI.
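As a rough illustration of how threshold and kf interact, the following sketch flags a 1D spectrum using a running median over 2*kf+1 channels. It is not the edges_cal implementation: the helper names running_median and medfilt_flags_sketch are hypothetical, and the noise level is estimated from a single global median absolute deviation, which is an assumption made to keep the example short.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def running_median(x, half_width):
    """Median over a window of 2*half_width + 1 samples (reflect-padded edges)."""
    padded = np.pad(x, half_width, mode="reflect")
    windows = sliding_window_view(padded, 2 * half_width + 1)
    return np.median(windows, axis=-1)


def medfilt_flags_sketch(spectrum, threshold=6.0, kf=8):
    """Hypothetical helper: flag channels that sit far from the running median."""
    spectrum = np.asarray(spectrum, dtype=float)
    resid = spectrum - running_median(spectrum, kf)
    # Assumption: one global, MAD-based noise estimate for the whole spectrum.
    # The factor 1.4826 converts a MAD to a Gaussian standard deviation.
    sigma = 1.4826 * np.median(np.abs(resid))
    return np.abs(resid) > threshold * sigma


rng = np.random.default_rng(0)
spectrum = 100.0 + rng.normal(0.0, 1.0, 200)
spectrum[50] += 50.0  # inject one strong, narrow RFI spike
flags = medfilt_flags_sketch(spectrum, threshold=6.0, kf=8)
print(np.flatnonzero(flags))
```

Lowering threshold flags more channels (and more noise), while kf sets how many neighbouring channels define the local "clean" level.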

Notes

The default combination of a median filter followed by a mean filter works quite well. The median filter is good at picking up RFI that is large with respect to the noise level, but can also create false positives if the noise level is small with respect to the steepness of the slope. Following it with a flagged mean filter tends to remove these false positives (as it doesn’t get pinned to zero when the function is monotonic).
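One way to read "flagged mean filter" is as a running mean that ignores channels that have already been flagged. The sketch below shows that idea with a hypothetical helper, flagged_running_mean, which is not taken from the edges_cal source. It illustrates why the mean is only a trustworthy local model once the RFI itself has been flagged.

```python
import numpy as np


def flagged_running_mean(x, flags, half_width):
    """Hypothetical helper: running mean that skips flagged samples."""
    kernel = np.ones(2 * half_width + 1)
    # Sum of unflagged values and count of unflagged samples in each window.
    total = np.convolve(np.where(flags, 0.0, x), kernel, mode="same")
    count = np.convolve((~flags).astype(float), kernel, mode="same")
    with np.errstate(divide="ignore", invalid="ignore"):
        # Windows with no unflagged samples have no defined mean.
        return np.where(count > 0, total / count, np.nan)


x = np.full(50, 10.0)
x[20] = 1000.0  # one strong RFI spike
no_flags = np.zeros(x.shape, dtype=bool)
spike_flagged = no_flags.copy()
spike_flagged[20] = True

biased = flagged_running_mean(x, no_flags, half_width=4)
clean = flagged_running_mean(x, spike_flagged, half_width=4)
print(biased[20], clean[20])
```

With the spike unflagged, the mean near channel 20 is dragged far above the true level; with it flagged, the mean recovers the flat baseline everywhere.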

It is unclear whether iterative filtering is very useful unless a polynomial subtraction is also used. With polynomial subtraction, one should likely use at least a few iterations, without accumulation, so that the polynomial is not skewed by the as-yet-unflagged RFI.
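The reasoning about refitting without accumulation can be sketched as follows. This is a simplified stand-in, not the library's algorithm: it thresholds the polynomial residual directly and skips the median filter step, and the function name iterative_poly_flags_sketch and its defaults are hypothetical. On each pass the polynomial is refit to the currently unflagged channels, and the new flags replace the old ones rather than being OR-ed with them.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def iterative_poly_flags_sketch(freq, spectrum, threshold=6.0, poly_order=3, max_iter=5):
    """Hypothetical helper: refit a polynomial each iteration, without accumulation."""
    flags = np.zeros(spectrum.shape, dtype=bool)
    for _ in range(max_iter):
        good = ~flags
        # Fit only to channels that are currently unflagged.
        coeffs = P.polyfit(freq[good], spectrum[good], poly_order)
        resid = spectrum - P.polyval(freq, coeffs)
        # Assumption: global MAD-based noise estimate from the unflagged residuals.
        sigma = 1.4826 * np.median(np.abs(resid[good]))
        new_flags = np.abs(resid) > threshold * sigma
        if np.array_equal(new_flags, flags):
            break  # converged: flags no longer change
        flags = new_flags  # replace, do not accumulate
    return flags


rng = np.random.default_rng(0)
freq = np.linspace(-1.0, 1.0, 300)
spectrum = 100.0 - 30.0 * freq + 10.0 * freq**2 + rng.normal(0.0, 0.5, 300)
spectrum[100:103] += 40.0  # a short burst of RFI
flags = iterative_poly_flags_sketch(freq, spectrum)
print(np.flatnonzero(flags))
```

Because the first fit is pulled slightly towards the RFI, any channel flagged only because of that skew gets a second chance once the polynomial is refit on cleaner data.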

Choice of kernel size can be important. The wider the kernel, the more “signal-to-noise” one will get on the RFI. Also, if a lot of RFI is clumped together, it will definitely be missed by a kernel window that is of order double the size of the clump or less. Increasing the kernel size picks up these clumps, but edge effects then become more prevalent. One option here would be to iterate over kernel sizes (getting smaller), such that very large blobs are first flagged out, then progressively finer detail is added. Use xrfi_iterative_medfilt for that.
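The effect of kernel size on clumped RFI can be illustrated with a small numpy sketch. This is not how xrfi_iterative_medfilt is implemented; flag_pass is a hypothetical helper, the kernel sizes are arbitrary, and the noise estimate is a single global MAD. A 10-channel clump is invisible to a 5-channel window, because the median inside the clump is the clump itself, but it is caught when passes run from wide to narrow kernels and each pass ignores data flagged so far.

```python
import warnings

import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def flag_pass(spectrum, flags, half_width, threshold=6.0):
    """Hypothetical helper: one median-filter pass that ignores flagged data."""
    data = np.where(flags, np.nan, spectrum)
    padded = np.pad(data, half_width, mode="reflect")
    windows = sliding_window_view(padded, 2 * half_width + 1)
    with warnings.catch_warnings():
        # Windows lying entirely inside flagged data give NaN; that is fine here.
        warnings.simplefilter("ignore", RuntimeWarning)
        model = np.nanmedian(windows, axis=-1)
    resid = spectrum - model
    # Assumption: global MAD-based noise estimate from unflagged channels.
    sigma = 1.4826 * np.nanmedian(np.abs(resid[~flags]))
    with np.errstate(invalid="ignore"):
        new = np.abs(resid) > threshold * sigma  # NaN residuals compare False
    return flags | new  # keep what wider kernels already flagged


rng = np.random.default_rng(1)
spectrum = rng.normal(0.0, 1.0, 300)
spectrum[100:110] += 50.0  # a 10-channel clump of RFI
spectrum[200] += 50.0      # an isolated spike

no_flags = np.zeros(spectrum.shape, dtype=bool)
small_only = flag_pass(spectrum, no_flags, half_width=2)

flags = no_flags
for half_width in (16, 8, 2):  # wide kernels first, then finer detail
    flags = flag_pass(spectrum, flags, half_width)

print(small_only[100:110].sum(), flags[100:110].sum())
```

The narrow kernel alone still finds the isolated spike, so the failure is specific to RFI that fills most of the window.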