When you were younger, did you ever plan your trick-or-treat route through a particular neighborhood in hopes of scoring the most and best sweet treats of the year? It was a tough decision to weigh: A) Target the large, expensive homes in hopes of landing full-size candy bars, knowing you'll have to walk much farther between houses. B) Target the large neighborhoods of smaller, closer homes, knowing you probably won't get the full-size treats but are almost certain to collect a lot of smaller ones.
There’s a saying in the Alteryx world that’s stuck with me: “If you find yourself repeating a task, make a macro.”
I have about 15 macros in daily use, 3-4 of which appear in most of my modules. The macro I'm sharing with you now actually replaces a native tool: Unique. I found myself constantly trying to figure out why duplicates were being removed: Why would a customer have multiple transactions in a day? Why would a property sell twice in the same month? Why are there multiple products with the same price?
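The idea behind the Unique tool is that it splits records into a "unique" stream and a "duplicate" stream based on one or more key fields. A minimal Python sketch of that split (not the macro itself; the field names and sample records here are made up for illustration) makes it easy to inspect exactly which rows were dropped and why:

```python
def split_unique(records, key_fields):
    """Return (unique, duplicates): the first record per key is kept,
    later records with the same key land in the duplicates stream."""
    seen = set()
    unique, duplicates = [], []
    for rec in records:
        key = tuple(rec[f] for f in key_fields)
        if key in seen:
            duplicates.append(rec)
        else:
            seen.add(key)
            unique.append(rec)
    return unique, duplicates

# Hypothetical sample data: two transactions by the same customer on
# the same day look like "duplicates" when keyed on customer + date.
transactions = [
    {"customer": "A", "date": "2024-01-02", "amount": 10},
    {"customer": "A", "date": "2024-01-02", "amount": 25},
    {"customer": "B", "date": "2024-01-03", "amount": 7},
]
kept, dropped = split_unique(transactions, ["customer", "date"])
print(len(kept), len(dropped))  # 2 1
```

Inspecting `dropped` rather than discarding it silently is exactly the kind of question the paragraph above raises: the duplicates often turn out to be legitimate records.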
Do you have a large database that you want to sample?
When the need arises to grab just a few records from a large database, it helps if you already know which records you are going to pick. You wouldn't pull in an entire database if you only wanted to look at the state of Indiana or female customers aged 20-24. You would apply these filters in Calgary or SQL before extracting, to speed up the process. Unfortunately, there hasn't been a way to pull a random set of records directly from an input… until now.
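In SQL, filtering and sampling at the source can be combined in a single query, so only the sampled rows ever leave the database. A small sketch using SQLite's `ORDER BY RANDOM()` (the table and column names here are invented for the example; other databases use different sampling syntax, e.g. `TABLESAMPLE`):

```python
import sqlite3

# Build a throwaway in-memory table standing in for the large database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, state TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(i, "IN" if i % 2 else "OH") for i in range(1000)],
)

# Filter AND sample in SQL, before anything reaches the workflow:
rows = conn.execute(
    "SELECT id, state FROM customers "
    "WHERE state = 'IN' ORDER BY RANDOM() LIMIT 10"
).fetchall()
print(len(rows))  # 10
```

Note that `ORDER BY RANDOM()` still scans the filtered set, so on very large tables a dedicated sampling clause or a pre-computed sample column can be faster.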
In this first post I wanted to address a common theme from several posts I've seen in the Alteryx community forum. "Dirty data" can bring an analysis to a halt before it even begins, especially if stray delimiters are present in the input file. They cause errors in the module that prevent any further data from being imported from the source.
Error: Input Data (3): Error reading “Dirty Dataset_Dirty.csv”: Too many fields in record #1
Luckily, there are several workarounds for this problem.
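One common workaround is to read each line as a single field and split it yourself, capping the number of splits at the expected column count so stray delimiters in the final field can't create extra columns. A hedged sketch of that approach (the column count and sample rows are assumptions for this example, not taken from the actual file):

```python
EXPECTED_COLS = 3  # assumed schema for this illustration

def parse_dirty_csv(lines, delim=","):
    """Split each line on the delimiter at most EXPECTED_COLS - 1 times,
    so any extra delimiters stay inside the last field."""
    rows = []
    for line in lines:
        parts = line.rstrip("\n").split(delim, EXPECTED_COLS - 1)
        rows.append(parts)
    return rows

dirty = [
    "id,name,comment",
    "1,Ann,loves cake, pie, and cookies",  # stray commas in last field
    "2,Bob,fine",
]
for row in parse_dirty_csv(dirty):
    print(row)
```

This only works cleanly when the stray delimiters live in the last column; delimiters loose in a middle column need a different repair, such as quoting the source export properly per RFC 4180.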