Behind closed doors ... Why am I so fussy about AI and AV? Because I know the “other side”
“Rawi, you’re a tech person. Why aren’t you excited about all the advances?”
“My dear, it’s precisely because I’m a tech person. It’s because I have hands-on experience with this stuff. That’s why I consider this ‘people going ape-shit’ insane.”
And the same applies to the age verification and digital privacy stuff. Because I know the “other side”. Yes, you’re reading this right. I’ve been there, on the opposing side of the barricade years ago.
Why am I writing this? Because I want to. I’m not interested in seeking some redemption. And yes, I know some of you will now want to hang me for “being one of them”. To you I say: “People can learn. People can change. If you don’t believe that, you’re not what you claim to be.”
Me and ML/AI: It will never work.
Partly because it literally can’t do all the stuff “AI-bros” think it can do and partly because none of the problems I faced a decade ago with a FAR SIMPLER implementation have been resolved or at least mitigated. They’re still there and will be forever: the “inbreeding”, the need to constantly revise the inputs, the need to constantly revise the outputs because the results aren’t consistent … congratulations, you’ve replaced “busy work” with “busy work you have no idea about”.
In case you’re wondering what I’m talking about, back in my “Rawi’s going to be a scientist” days, I was working on utilising ML for analysing network traffic, specifically without having to dig into the contents because A) you need a warrant for that stuff (seriously, you can’t just tcpdump whatever traffic you want and dig into it if you want to use it, even if it’s for research purposes) and B) you won’t be able to if it’s encrypted (if your ears have perked up with caution and warning signs start screaming at you in your mind … you’ve got a good hunch where I’m about to go with this). The goal was to come up with something relatively small so it could be used in programmable hardware and basically be a preprocessor before the data went into software for more detailed processing. Not only was this a challenge due to resource constraints (you can’t really squeeze a massive ML model into a chip which still needs to fit the network interfaces AND an already present processing pipeline), but the ML part itself also had huge obstacles to get over, one of them already mentioned.
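To make the “without digging into the contents” part a bit more concrete, here is a minimal sketch of what payload-free features can look like. It is not what we actually built; the `PacketMeta` type, the `flow_features` function and the chosen features (sizes, gaps between packets, direction) are my assumptions for illustration only.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass(frozen=True)
class PacketMeta:
    """Metadata only: no payload bytes are stored or inspected (hypothetical type)."""
    timestamp: float  # seconds since the start of the capture
    size: int         # packet length in bytes
    outbound: bool    # True if sent by the monitored host


def flow_features(packets):
    """Summarise one flow using nothing but sizes, timing and direction."""
    if not packets:
        raise ValueError("a flow needs at least one packet")
    ordered = sorted(packets, key=lambda p: p.timestamp)
    sizes = [p.size for p in ordered]
    # Inter-arrival gaps between consecutive packets.
    gaps = [b.timestamp - a.timestamp for a, b in zip(ordered, ordered[1:])]
    return {
        "packet_count": len(ordered),
        "total_bytes": sum(sizes),
        "mean_size": mean(sizes),
        "size_stdev": pstdev(sizes),
        "mean_gap": mean(gaps) if gaps else 0.0,
        "outbound_ratio": sum(p.outbound for p in ordered) / len(ordered),
    }
```

A handful of numbers like these is small enough to be plausible as input for a tiny classifier, which is the whole point when the thing has to share a chip with everything else.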
Where do you get data? Well, I had a sample from when we were testing our little toy in our lab (we basically became our own guinea pigs) which I could use for training the model. But here comes the first problem: data is data. A binary mess with no meaning at all to the machine. This is where the human needs to step in and provide context clues. How do you do it? By, wait for it, sitting down and going through the pile MANUALLY. That’s right, I had to take the pile of data, take out a sample from that pile I would use for training, MANUALLY classify what is supposed to be in there and then do the same for a dataset I’ll be using for evaluation. Great, that’s weeks of super mundane busy work WHICH YOU CAN’T AUTOMATE. Why? Because the machine has no brain. Someone needs to provide it. And that someone … was me.
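The only part of this that a machine can help with is the bookkeeping after the labelling is done. Here is a rough sketch, assuming the hand-labelled samples are simple `(sample_id, label)` pairs; the function name `split_labelled`, the 20% evaluation share and the labels in the usage example are all made up for illustration.

```python
import random


def split_labelled(samples, eval_fraction=0.2, seed=0):
    """Split hand-labelled samples into disjoint training and evaluation sets.

    `samples` is a sequence of (sample_id, label) pairs that a human has
    already classified. The fixed seed keeps the split reproducible.
    """
    if not 0 < eval_fraction < 1:
        raise ValueError("eval_fraction must be between 0 and 1")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_eval = max(1, round(len(shuffled) * eval_fraction))
    # The first n_eval items are held out for evaluation, the rest is training data.
    return shuffled[n_eval:], shuffled[:n_eval]


# Usage with hypothetical labels:
labelled = [(i, "web" if i % 2 else "voip") for i in range(10)]
train_set, eval_set = split_labelled(labelled)
```

Note what the code does not do: it does not produce a single label. Every pair in `labelled` still has to come from a human going through the pile.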
How good was what I came up with? Well, the thing had about 95% accuracy. And it was giving a very rough overview of what might be in the data stream while it sometimes had really clear clues. And the worst part? It couldn’t retain that accuracy. I’m not kidding. I could grab the same model weeks later and run it on the very same sample without changing anything and the accuracy would plummet to 70%. Let me say it again: I took the SAME model, fed it the SAME data and it deteriorated. And it wasn’t even a complicated neural network, far from it in fact.
Then there’s the “AI inbreeding” thing: what happens when you start training the model on the output of another model or even its own output. Well, you’re getting into training the model on data without any oversight. And as you know, AI models are sycophantic “by design” so they’ll end up in a self-affirming positive feedback loop. And if you’re in any way familiar with signal processing, you know that positive feedback is something to avoid like the plague, covid and ebola combined. Long story short, feeding the model its own output for training will make it deteriorate even faster because it’ll lose the ability to “apply” the learnt patterns onto anything that’s not almost 1-to-1 to the output. Basically, your model will become a binary decider with extremely narrow margins for error.
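If you want to see the narrowing for yourself, here is a toy illustration. It is not the model I worked on and not a real ML pipeline: the “model” is just a Gaussian (a mean and a spread) that gets refitted, generation after generation, to a small batch of samples drawn from its own previous fit. The function name, the batch size of 10 and the 200 generations are assumptions picked for the demo.

```python
import random
from statistics import mean, pstdev


def self_training_spread(generations=200, sample_size=10, seed=0):
    """Refit a Gaussian to its own samples and track how its spread evolves.

    Returns the list of standard deviations, starting with the original 1.0.
    """
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0  # the "real" distribution the first model learnt
    spreads = [sigma]
    for _ in range(generations):
        # The model generates its own training data ...
        samples = [rng.gauss(mu, sigma) for _ in range(sample_size)]
        # ... and is refitted on it with no outside data or human correction.
        mu, sigma = mean(samples), pstdev(samples)
        spreads.append(sigma)
    return spreads


spreads = self_training_spread()
print(f"spread at start: {spreads[0]}, spread at end: {spreads[-1]:.3g}")
```

Each refit on a small batch tends to underestimate the spread a little, and with no fresh data to correct it the errors compound, so the distribution keeps shrinking towards a single point. That is the “extremely narrow margins” effect in miniature.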
May I add a sad story to this? At one of the meetings which served as brainstorming sessions in our research groups, one of my colleagues and friends who was working in the same field raised this question to my supervisor. I already had this concern but he had the guts (and the position) to say it out loud. When I heard that question, my brain immediately told me “See? You’re not alone thinking this. You’re very much onto something.” And the supervisor answered my friend:
“You’re too skeptical about it. You say it because you don’t believe in the idea.”
Does this sound familiar? The tone, the wording … Mind you, this was my supervisor at the time. And this was seven years ago as of writing. I’m kicking myself in the arse in retrospect for not standing up and saying: “Believe? He’s right! I’ve had this on my mind for some time already and this is a serious issue which basically makes this idea infeasible.” But I didn’t. I did voice some of my skepticism later in a milder way, more along the lines of being worried about it but not having enough substance to confirm it. But fucking hell, I should’ve shouted loud that this is wrong and it’s never going to work.
Oh well, can’t change the past. But I can learn from it. So when you see me going up in arms against AI, it’s not just for the moral and ethical reasons but also for the technical reasons and things that haven’t been solved and honestly, they can’t be solved.
But now for the harsher confession … the one that will probably make some people hate my guts and also provide the reasons why I was doing what I was doing.
Me and AV and Digital Privacy: I was one of them.
No, this isn’t a lie. I was indeed one of the bad ones. I was indeed one of the “Bastards”. Yes, I worked for Law Enforcement, specifically for Lawful Interception of network data. THIS is why I’m so militant now when it comes to Age Verification and all the privacy invading mess. It’s also partly why I can tame my reactions to it occasionally because there’s a bit of a silver lining: it gave me the insight into the legal framework of how this stuff is used.
The latter I’d like to expand on a bit. The dark side is, my country isn’t private. It abso-fucking-lutely isn’t. We have LI laws. We have Data Retention laws. Like, we have a digital “State Security” and we’ve had it FOR YEARS. Luckily, nobody really gives a damn. And frankly, not even the people who should enforce it. Why? Because whoo boy, the legal framework is strict about using this stuff. Remember when I mentioned you need a warrant to handle network data? This is exactly it. There’s a specific law on WHAT you can capture, HOW you can capture it AND HOW you can use it. And if you break ANY of this, the court will just toss the evidence in the trash. To give you a small overview: The capture needs to be EXACT (as in nothing extra is allowed to be captured), it must be COMPLETE (you lose something and the evidence goes out the window) and it MUST NOT touch any application data that’s not the intended target. The last part is important because it’s really difficult to do at all, let alone do it right. The law is a little more lenient when it comes to metadata but you’ll get a really nasty “side-eye” if you go too broad there too.
So yeah, this stuff isn’t easy from a legal standpoint but that doesn’t give an excuse to be invasive. Not that the cops need to be because an overwhelming majority of people are doing their job for them *points at EVERY SINGLE social media profile*. By the way, I have that from someone literally working for our national secret service. Seriously, as much as they liked the tools we were working on, the person I talked to pretty much told me that “We can just look at their Facebook page because they’ll tell us everything there. Seriously, people are this bad at OpSec.”
Now, how does this relate to me being militant about AV and privacy and frankly sometimes losing my mind with “privacy freaks”? The AV part is quite obvious. It’s victim-blaming, dehumanisation of kids and basically a dream world for the likes of Gestapo/NKVD/KGB/CIA/GCHQ etc. The privacy part is, and I’m going to get a lot of flak for this, so many people are so high on their own “privacy supremacism pedestal”. Except that pedestal is so shaky that it can topple with a stronger sneeze. On one hand, one’s personal life is nobody’s business. Yes, I mean it. Nobody is entitled to know what you’re personally doing.
“But what about criminals?” Does the saying “Snitches get stitches” sound familiar? Yes, that is a huge part of why so many bad people get away with things. You don’t hold them accountable and it’s your fault. Yes, I know, power dynamics are a thing. But hear me out: If you didn’t allow things to rot and fester into this power hellscape, you wouldn’t need to be worried.
“But I’m a targeted person of ” So am I. Very much so despite not looking like it. But here’s the thing. If I were in actual serious danger, I wouldn’t be writing these lines. Hell, we wouldn’t be having any conversation anymore because either I’d already be behind bars or fertiliser or I’d be fighting tooth and nail to not be that.
And even with all that said, the “my personal life is nobody’s business” stands. Same goes for kids. Their personal life is nobody’s business unless it harms them. And trust me, kids UNDERSTAND VERY WELL something’s not right. The reason they often don’t tell anyone isn’t because they don’t know, but because they’re scared. And being far less experienced in this world which we made opaque and obtuse for some unknown reasons, they’re much easier to manipulate.
So yeah, the reason I’m so much against all the Age Verification and privacy-removing bullcrap and I go borderline rabid at anyone who even thinks about defending it is because I know the other side.
So there it is. A confession of someone who was the bad guy, or at least worked for the bad guys (Banality of Evil, anyone?). Am I going to piss someone off with this? Absolutely. But hear me out. I’m not hiding behind all I’ve done. I’m not justifying the behaviour I contributed to. I want to at least try and fix what I can. Can I do it easily? No. But I don’t have to do it alone.
R.R.A.