Code munging AI megabeast
- azurecoder
- Jul 20, 2024
- 7 min read
So ... It's Elastacloud's summer of code hack. It was my idea because I wanted to take some time away from dispute resolution, Excel spreadsheets, lawyers and accountants. As much as I love the enduring challenge of where the nut allergy sign should sit in the kitchen, I wanted to contribute to the betterment of humanity with AI by classifying how many Teams meetings some of our customers have that are a complete waste of time, and the attributable financial cost. Lofty goals, but then I've always set a high bar for myself.
And so the problems start. My team are building the video engine which will "crack" videos in our flagship AI product, Knowledge Miner. Loads of challenges to do this at scale. Some bright spark reading this will no doubt say: why don't you just use Microsoft Copilot? That silver bullet is more like a lead cannonball. Go waste some money and tell your boss AI is a turd wrapped in gold leaf. Square peg - round rabbit hole.
Anyway, turns out if you haven't written code in 2 years you get rusty, and not of the nail variety but more like a hundred monkeys trying to write the complete works of Shakespeare. Some of us aged members of the nerd race know that the infinite improbability drive can do this - but my point is proven.
Anyway again, so I managed to build an application all on my very lonesome which works in React and Flask and pulls apart videos, transcribes them, classifies them, and works out who is speaking some of the time (not telling how I do this one hehe, the struggle here is worth it though). It was working beautifully locally, but given Knowledge Miner will need to process thousands of videos concurrently, the easiest route was my old chestnut friend Azure Batch.
Elastacloud started out life as a distributed computing and HPC consultancy. I used to be a whizz at using the HPC scheduler and later on Azure Batch. Turns out not so much now. So ... I thought, I'm writing code again, let me use AI; ChatGPT knows everything. Turns out it has this sadistic side, so I wanted to document my 14 hour ordeal. If it had a face I swear it would be like The Joker, with "going to hallucinate, martha fokker" emblazoned across its sweaty forehead, only visible under a blacklight.
For those that don't know, Azure Batch allows you to create VMs with a specific build by running a startup task to configure all the VMs across a Pool so that you can have loads of VMs and the same build across all of them. All very nice. Then you can send your software in the form of a job with many tasks which are queued until there is capacity across the pool. There you go, I'm better at this than Microsoft documentation already!
So I have a bunch of Python libraries that I have to support. Should be easy given Python is everywhere. Wrong! I decided to deploy my startup task using the CLI. I've been a good little Python programmer and, whilst I was running this in Flask, I kept my requirements.txt up to date.
Let me start at the beginning with hours 1-2. This was my refresher time where I wrote a startup task. I decided to stick with Python 3.12 even though it wasn't part of the Ubuntu 18.04 build. This was probably my first mistake; turns out it's incredibly hard to deploy Python 3.12.
# build dependencies for compiling CPython from source
sudo apt-get update
sudo apt-get install -y build-essential libssl-dev libbz2-dev libreadline-dev libsqlite3-dev zlib1g-dev libffi-dev
cd /tmp
wget https://www.python.org/ftp/python/3.12.0/Python-3.12.0.tgz
tar xzf Python-3.12.0.tgz
cd Python-3.12.0
./configure --enable-optimizations
make -j $(nproc)
# altinstall so we don't clobber the system python3
sudo make altinstall
This shit is basically what you need to build Python from source and it takes forkin' ages. The downside in Batch is that if you screw up you have to start again, because you basically fry the Batch pool, so you get into a 10-minute cycle of reimaging nodes.
As much as I love timesucks like this, I wanted to spend my time writing Python, not reimagining its existence in an unnamed container somewhere in Dublin.
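If you do end up in that cycle, it's usually quicker to bounce the existing nodes than to rebuild the pool, because a reboot re-runs the start task. Something roughly like this does it - a sketch with the azure-batch Python SDK, using placeholder environment variables for the account and the pool id from the command further down:
# Sketch, not gospel: reboot every node in the pool so the start task re-runs,
# rather than deleting and recreating the whole pool each time startup.sh changes.
# Account details come from placeholder environment variables.
import os
from azure.batch import BatchServiceClient
from azure.batch.batch_auth import SharedKeyCredentials

credentials = SharedKeyCredentials(os.environ["BATCH_ACCOUNT_NAME"], os.environ["BATCH_ACCOUNT_KEY"])
batch_client = BatchServiceClient(credentials, batch_url=os.environ["BATCH_ACCOUNT_URL"])

for node in batch_client.compute_node.list("ecpool"):
    # a reboot re-runs the start task with the freshly uploaded resource files
    batch_client.compute_node.reboot("ecpool", node.id)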
The fun started again here when I'd installed it and decided that I was going to carry on using virtual environments like I'd been attuned to on my Mac.
cd $AZ_BATCH_TASK_WORKING_DIR
/usr/local/bin/python3.12 -m venv videocracker
source $AZ_BATCH_TASK_WORKING_DIR/videocracker/bin/activate
pip install -r $AZ_BATCH_NODE_STARTUP_DIR/wd/requirements.txt
deactivate
This bit worked great. Not! It stealthily failed in a multitude of different ways, one of those stealthy fails in Linux where you end up on a journey to the centre of the Earth once you start looking at why libraries and their dependencies fail. After two and a half decades of foresight I could see myself setting fire to hours 4-8.
I was right.
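Debugging those stealthy failures means going to the node, because the start task's output never comes to you. Roughly, reusing the client from the sketch above (startup/stdout.txt and startup/stderr.txt are where Batch drops start-task output):
# Sketch: pull the start task's stdout/stderr off each node to see what broke.
# Assumes the batch_client constructed in the earlier sketch.
for node in batch_client.compute_node.list("ecpool"):
    for name in ("startup/stdout.txt", "startup/stderr.txt"):
        print(f"===== {node.id}: {name} =====")
        chunks = batch_client.file.get_from_compute_node("ecpool", node.id, name)
        print(b"".join(chunks).decode("utf-8", errors="replace"))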
I forgot a bit in hours 1-3.
az batch pool create \
--id ecpool \
--vm-size Standard_D2s_v3 \
--target-dedicated-nodes 1 \
--image canonical:UbuntuServer:18.04-LTS:latest \
--node-agent-sku-id "batch.node.ubuntu 18.04" \
--account-name bisedtor \
--account-key xxx \
--account-endpoint https://bisedtor.northeurope.batch.azure.com \
--start-task-command-line "/bin/sh -c 'chmod +x startup.sh && ./startup.sh'" \
--start-task-resource-files "startup.sh=https://bisedtor.blob.core.windows.net/scripts/startup.sh" "requirements.txt=https://bisedtor.blob.core.windows.net/scripts/requirements.txt" \
--start-task-user-identity "autoUser:Scope=pool;ElevationLevel=admin" \
--start-task-wait-for-success
Here now starts the AI rabbit hole. Turns out the command line above is more or less right. The resource files (my requirements.txt and startup.sh, both used to set up my Python environment) are copied across from storage before the start task runs. The CLI wants each one as filename=storage_url, but what ChatGPT told me to do was add httpUrl=xxx;filePath=xxx. So that ... was utter garbage. Finally got it right though. ChatGPT please RTFM!
Confusing thing was that my startup script complained because it couldn't find sudo. If I couldn't execute as root I'd get a lock error and my startup script would fail. I knew that I needed to run this with elevated permissions. So I asked ChatGPT. What would you do?
Turns out it would add the identity tag. This would have been great if it hadn't been utter shash. It further turns out it's not in the help or the documentation so go figure. ChatGPT please RTFM!
Anyway, clearly doesn't work. Thanks AI. More LSD output. Decided enough AI, so I hit the docs. pool_config.json was the answer. I read and built my config and then the penny dropped: my happy-go-lucky chatbot was pulling every tag from the JSON file and extrapolating it into command-line flags.
"userIdentity": {
"autoUser": {
"scope": "pool",
"elevationLevel": "admin"
}
}
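For anyone who'd rather not play CLI flag bingo at all: those tags live in the start task section of the pool definition, and the Python SDK mirrors them one-for-one. A rough sketch of the same pool, start task and elevated auto user via the azure-batch SDK (the account key and blob URLs are obviously placeholders):
# Sketch: the same pool via the azure-batch SDK instead of CLI flags.
# Account key and storage URLs are placeholders; userIdentity and resourceFiles
# map one-for-one onto the pool JSON.
import azure.batch.models as batchmodels
from azure.batch import BatchServiceClient
from azure.batch.batch_auth import SharedKeyCredentials

credentials = SharedKeyCredentials("bisedtor", "xxx")
batch_client = BatchServiceClient(credentials, batch_url="https://bisedtor.northeurope.batch.azure.com")

start_task = batchmodels.StartTask(
    command_line="/bin/sh -c 'chmod +x startup.sh && ./startup.sh'",
    resource_files=[
        batchmodels.ResourceFile(http_url="https://bisedtor.blob.core.windows.net/scripts/startup.sh",
                                 file_path="startup.sh"),
        batchmodels.ResourceFile(http_url="https://bisedtor.blob.core.windows.net/scripts/requirements.txt",
                                 file_path="requirements.txt"),
    ],
    user_identity=batchmodels.UserIdentity(
        auto_user=batchmodels.AutoUserSpecification(scope=batchmodels.AutoUserScope.pool,
                                                    elevation_level=batchmodels.ElevationLevel.admin)),
    wait_for_success=True)

pool = batchmodels.PoolAddParameter(
    id="ecpool",
    vm_size="Standard_D2s_v3",
    target_dedicated_nodes=1,
    virtual_machine_configuration=batchmodels.VirtualMachineConfiguration(
        image_reference=batchmodels.ImageReference(publisher="canonical", offer="UbuntuServer",
                                                   sku="18.04-LTS", version="latest"),
        node_agent_sku_id="batch.node.ubuntu 18.04"),
    start_task=start_task)

batch_client.pool.add(pool)
Same shape as the JSON, which is presumably exactly what the chatbot was regurgitating into imaginary CLI switches.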
Okay. So now we're proper done and we've got a pool. What next? Well, we need to run our task. In my case I'd been perpetually trying to install pyaudio, which I need to process audio. One of the problems with Linux and running anything is that if you don't have perfect conditions things fail miserably, and so they did, right on cue. PyAudio has a dependency on the portaudio library, which is native, and compilation failed every time. Feeling like I was suffocating in a zorb slowly turning through 16 football pitches with a mosquito right in the middle of that unreachable part of your back, I took a breath, wasted 15 minutes on meditation and figured it out. Managed to install pyaudio through the python3-pyaudio package from Ubuntu's apt. Smashed it, took a break, went across the road for my oat milk flat white fix. Ran the task. Boom! Turns out the Python 3.12 deployment and compilation was on the fritz at so many levels, with a ctypes exception. Deep breaths, Richard. Clear your mind and you'll come up with the solution. And then it came to me ... Call Darren!
Over the years, as my brain cells have died day by day from a combination of CT600, P11D and other alphanumeric form-based data structures, marked by the commonality of the £ symbol and the abbreviation HMRC (the UK version of the IRS and just as scary), I've learned to play boss better and delegate, so here comes my strength. When the going gets tough, the going gets Darren or Sandy.
Darren could sense my frustration in the first 60 seconds so calmed me down with platitudes about how it was really Microsoft's fault. Felt better already. He also asked: do you really need Python 3.12? Yes, I did. Well, maybe. Erm ... Actually not at all. This is the brilliant foresight of code mechanics who were brought up in the age of DLL Hell, of which I used to be an honorary citizen. Dodge a bullet. Just don't be there. Think Trump.
And so here I am stuck with Python 3.9 (who cares) on Ubuntu 22.04 (who cares - although fascinating how the image was called jammy, yes like the jammy dodger).
Anyway, code simplified. Thanks Daz.
sudo apt update --yes
sudo apt install python3-pip portaudio19-dev ffmpeg python3-pyaudio --yes
sudo -H python3 -m pip install -r $AZ_BATCH_NODE_STARTUP_DIR/wd/requirements.txt
Haha AI, Darren and Richard beat you and it only took 3.5 x 10^-5 of our lifespans. How do you feel, chump?! Probably nothing, because you're not truly sentient, although you are much better at predicting the next word than I am.
So what next?
Now the task. That bit was much easier for me. I'd written the code already, it worked locally, and I'd built something to send the task to the Batch pool. I'd done this 1,000 times already in .NET so didn't need ChatGPT for this bit. Here's the simplified view:
import os
import azure.batch.models as batchmodels
from azure.batch import BatchServiceClient
from azure.batch.batch_auth import SharedKeyCredentials
from flask import jsonify, request

batch_account_name = os.environ["BATCH_ACCOUNT_NAME"]
batch_account_key = os.environ["BATCH_ACCOUNT_KEY"]
batch_account_url = os.environ["BATCH_ACCOUNT_URL"]
video = request.args.get('video')
credentials = SharedKeyCredentials(batch_account_name, batch_account_key)
batch_client = BatchServiceClient(credentials, batch_url=batch_account_url)
job = batchmodels.JobAddParameter(id="videocracker_job", pool_info=batchmodels.PoolInformation(pool_id="ecpool"))
task = batchmodels.TaskAddParameter(
    id="videocracker_task",
    command_line=f"/bin/sh -c 'chmod +x audio_workflow.py && echo \"audio chmod executed\" && python3 audio_workflow.py {video} && echo \"audio_workflow.py executed\"'",
    resource_files=[batchmodels.ResourceFile(
        file_path="audio_workflow.py",
        http_url="https://bisedtor.blob.core.windows.net/scripts/audio_workflow.py?sasurietc.")],
    user_identity=batchmodels.UserIdentity(auto_user=batchmodels.AutoUserSpecification(
        scope=batchmodels.AutoUserScope.task,
        elevation_level=batchmodels.ElevationLevel.admin)))
try:
    batch_client.job.get(job.id)
except batchmodels.BatchErrorException:
    # muhahahaha - the job doesn't exist yet, so create it
    batch_client.job.add(job)
batch_client.task.add(job_id=job.id, task=task)
return jsonify(results="Job submitted")
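If you then want to know whether the crack actually finished, rather than waiting for the storage container to fill up, polling the task state with the same client is enough. A rough sketch, assuming the batch_client and batchmodels from the snippet above and the same job id:
import time

# Sketch: wait for the cracking job's tasks to finish and report exit codes.
# Assumes batch_client and batchmodels from the snippet above.
while True:
    tasks = list(batch_client.task.list("videocracker_job"))
    if tasks and all(t.state == batchmodels.TaskState.completed for t in tasks):
        break
    time.sleep(30)

for t in tasks:
    print(t.id, t.execution_info.exit_code)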
There is so much more to this story, but as we wade through the world of gruelling Azure OpenAI, ChatGPT and the age of laziness, we'll begin to understand that the age-old software rule that 80% of the work takes 20% of the time becomes something like: 95% of the work takes a few-shot prompt and 5% of the time, and for the rest of your day you'll want to straitjacket yourself whilst you gurgle prompt leakages.
Happy trails!