I WAS JUST USING THE FILTER FUNCTION this past week to clean up a nasty looking phone number field from a web source. The field in question allowed the user to type in any alphanumeric data they wanted: Cell, Extension, ATTN, Attention, Mobile, etc. It also allowed too many digits, too few digits, parenthesis, dashes, and anything else an end user could type of in the field. As a result, the field contained a lot of useless information and those records had to be fixed manually or just be dumped.
The saving factor was the Filter function, which allows the data field to be scrubbed of anything except what you want left. So, to clean out all the excess, it was a simple matter of applying this function:
And all of the excess alphanumeric information was gone, including extra spaces and all formatting. So, “Cell: (504) 123-5432 CAll after four” became “5041235432”.
From that point, I could format the data any way I wished. And since the final data output required an 11 digit number starting with a 1 and no other characters, the process was pretty easy from that point.
Stuart Gripman of FullCityConsulting provides some excellent pointers on the use of the Filter function in his recent blog post:
In keeping with a text parsing theme I seem to have going here, this week’s function is the fabulous Filter function. It’s a text strainer that only permits the characters you specify to come out the other end. But you must be cautious, Filter behaves in very specific ways that might trip you up. Here’s the basic format.
Here, our text to examine is a phone number and a name. Our characters to include are the ten digits zero through nine. Filter will examine each character in the text to the left and retain only those that appear in the filter text to the right. All other characters are caught in the filter and don’t get through. So, as you’ve probably worked out by now, the result of the above calculation is “8675309″.
Follow the link to learn more.