Pies, Lies and AIs

Exploring the world of data and organisational intransigence

azurecoder

Jul 23, 2024 · 4 min read

Richard vs the Microsoft Speech SDK round 2

When the Python Speech SDK fails on Linux

59 views

11 comments

dazfuller

May 1, 2022 · 2 min read

Dropping a SQL table in your Synapse Spark notebooks (Python Edition)

So since writing the original post about how to drop a SQL table from a Spark Notebook I've been meaning to follow it up with a version...

1,120 views

13 comments

dazfuller

Oct 21, 2021 · 4 min read

Dropping a SQL table in your Synapse Spark notebooks

For the Python version of the code below, see the follow-up post. One of the nice things with Spark Pools in Azure Synapse Analytics is...

2,506 views

11 comments

dazfuller

Aug 26, 2021 · 6 min read

Documentation the easy way

So a slight departure from Spark (sort of) for this post, but I wanted to look at one of the most commonly overlooked aspects of building...

473 views

3 comments

dazfuller

Jul 24, 2021 · 9 min read

Processing Excel Data using Spark with Azure Synapse Analytics

Having recently released the Excel data source for Spark 3, I wanted to follow up with a "let's use it to process some Excel data" post....

5,863 views

5 comments

dazfuller

Jul 3, 2021 · 4 min read

Using Spark to read from Excel

People have data in Excel, so let's have a look at how we can read that data using Spark

14,704 views

4 comments

dazfuller

May 22, 2021 · 3 min read

Just one more column, what could go wrong?

Sometimes, when you go scanning through the documentation for Spark, you come across notes about certain functions. These tend to offer...

2,172 views

6 comments

dazfuller

Apr 25, 2021 · 6 min read

Why leave bad data to chance?

Something that we often see as Spark jobs are moved into production is that handling of bad data is either ignored, or a lot of effort...

163 views

4 comments

dazfuller

Feb 21, 2021 · 3 min read

Pivot, Step, Pivot, Twist, Un-pivot

Getting data into a good shape is a key part of Data Engineering, and we often get data in all sorts of shapes and quality

108 views

4 comments

dazfuller

Feb 15, 2021 · 6 min read

When in doubt, shell out

The command line is a powerful environment that lets you do a lot of work quickly, easily, and in a repeatable way

70 views

4 comments
