azurecoder | Jul 23, 2024 | 4 min read | Richard vs the Microsoft Speech SDK round 2 | When the Python Speech SDK fails on Linux
dazfuller | May 1, 2022 | 2 min read | Dropping a SQL table in your Synapse Spark notebooks (Python Edition) | So since writing the original post about how to drop a SQL table from a Spark Notebook I've been meaning to follow it up with a version...
dazfuller | Oct 21, 2021 | 4 min read | Dropping a SQL table in your Synapse Spark notebooks | For the Python version of the code below, see the follow-up post. One of the nice things with Spark Pools in Azure Synapse Analytics is...
dazfuller | Aug 26, 2021 | 6 min read | Documentation the easy way | So a slight departure from Spark (sort of) for this post, but I wanted to look at one of the most commonly overlooked aspects of building...
dazfuller | Jul 24, 2021 | 9 min read | Processing Excel Data using Spark with Azure Synapse Analytics | Having recently released the Excel data source for Spark 3, I wanted to follow up with a "let's use it to process some Excel data" post....
dazfuller | Jul 3, 2021 | 4 min read | Using Spark to read from Excel | People have data in Excel, so let's have a look at how we can read that data using Spark
dazfuller | May 22, 2021 | 3 min read | Just one more column, what could go wrong? | Sometimes, when you go scanning through the documentation for Spark, you come across notes about certain functions. These tend to offer...
dazfuller | Apr 25, 2021 | 6 min read | Why leave bad data to chance? | Something that we often see as Spark jobs are moved into production is that handling of bad data is either ignored, or a lot of effort...
dazfuller | Feb 21, 2021 | 3 min read | Pivot, Step, Pivot, Twist, Un-pivot | Getting data into a good shape is a key part of Data Engineering, and we often get data in all sorts of shapes and quality
dazfuller | Feb 15, 2021 | 6 min read | When in doubt, shell out | The command line is a powerful environment that lets you do a lot of work quickly, easily, and in a repeatable way