Python script written for use as a function in Caseware’s IDEA.
This script defines a function numTidy that takes a string representation of a number and cleans it by removing unwanted characters and ensuring the number is properly formatted.
## How It Works:
The line cleaned_str = re.sub(r'[^\d.-]', '', number_str) uses a regular expression to remove every character from the input string that is not a digit (\d), a negative sign (-), or a decimal point (.).
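As a small illustration of that first step, the snippet below applies the same pattern to a couple of made-up inputs. The helper name keep_numeric_chars and the sample strings are assumptions for demonstration only; they are not part of the original script.

```python
import re


def keep_numeric_chars(number_str):
    # Hypothetical helper: drop everything except digits, '.', and '-'.
    return re.sub(r"[^\d.-]", "", number_str)


print(keep_numeric_chars("$ 1,234.56 USD"))  # 1234.56
print(keep_numeric_chars("(-42)"))           # -42
```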
The next section checks whether there is more than one negative sign. If so, it keeps only the first negative sign at the front by using cleaned_str = '-' + cleaned_str.replace('-', '', 1).
If the string contains multiple decimal points, this part splits the string at the first decimal point and joins the remaining parts, allowing only one decimal point.
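The decimal-point step is described without code, so here is one possible way to write it. This is a sketch under assumptions: the helper name keep_first_decimal_point is hypothetical, and the original script may structure the split and join differently.

```python
def keep_first_decimal_point(cleaned_str):
    # Hypothetical helper: only act when more than one '.' is present.
    if cleaned_str.count(".") > 1:
        # Everything before the first '.' stays as the integer part;
        # the remaining pieces are glued together with no separators.
        integer_part, *fraction_parts = cleaned_str.split(".")
        return integer_part + "." + "".join(fraction_parts)
    return cleaned_str


print(keep_first_decimal_point("1.2.3"))  # 1.23
print(keep_first_decimal_point("12.5"))   # 12.5
```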
The line cleaned_str = re.sub(r'[^0-9]+\Z', '', cleaned_str) removes any non-digit characters left at the end of the string, such as a trailing decimal point or negative sign.
The function then attempts to convert the cleaned string into a floating-point number using float(cleaned_str). If successful, the number is returned as a string.
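Putting the steps together, a minimal sketch of such a function might look like the following. It is not the original numTidy: the name num_tidy_sketch is hypothetical, the negative-sign step strips every minus sign before re-adding one (which differs from the replace(..., 1) call quoted above), and returning an empty string on failure is an assumption, since the description does not say what happens when the conversion fails.

```python
import re


def num_tidy_sketch(number_str):
    """Illustrative clean-up of a numeric string (hypothetical, not numTidy)."""
    # Step 1: keep only digits, minus signs, and decimal points.
    cleaned = re.sub(r"[^\d.-]", "", number_str)

    # Step 2: with more than one minus sign, keep a single one at the front.
    # Assumption: all minus signs are stripped before one is re-added.
    if cleaned.count("-") > 1:
        cleaned = "-" + cleaned.replace("-", "")

    # Step 3: with more than one decimal point, keep only the first.
    if cleaned.count(".") > 1:
        head, *rest = cleaned.split(".")
        cleaned = head + "." + "".join(rest)

    # Step 4: drop any trailing characters that are not digits.
    cleaned = re.sub(r"[^0-9]+\Z", "", cleaned)

    # Step 5: validate by converting to float, then return a string.
    try:
        return str(float(cleaned))
    except ValueError:
        # Assumption: the failure behaviour is not described in the text.
        return ""


print(num_tidy_sketch("$1,234.56"))    # 1234.56
print(num_tidy_sketch("--12.5.0abc"))  # -12.5
print(num_tidy_sketch("USD 45."))      # 45.0
```

Because the result goes through float and back to a string, whole numbers come back with a trailing ".0" in this sketch; returning the cleaned string itself would be an equally plausible reading of the description.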
## Usage:
This function can be used in various contexts, such as cleaning user inputs that may include non-numeric characters, to ensure valid numerical data. In IDEA, the function would be called using @Python("numTidy", [data]).
PYTHON numTidy.txt