Long-term goal of using technology in the home to improve elderly care. Short term, I like to make AI art with Stable Diffusion, using LLMs for prompt automation. Cybersecurity/DFIR background; I generally enjoy breaking stuff to see how it works.
As you have probably heard, in recent weeks three tech giants (Microsoft, Amazon and Google) announced that they would bet on nuclear reactors to meet the surging energy demand of data centers, driven by growing AI data and computational workloads.
Andrew Ng recently gave a strong defense of open-source AI models at Stanford GSB, arguing that legislative efforts in the US and the EU that would restrict innovation in open-source AI need to slow down.
# Offensive Security Reconnaissance Continued with Public Facing Industrial Control System HMIs using Moondream
Building on my previous experiments with Moondream for automating physical security reconnaissance planning (https://huggingface.co/posts/Csplk/926337297827024), I've now turned my attention to exploring the potential of this powerful image-text-to-text model for offensive security reconnaissance in the realm of Industrial Control Systems (ICS). ICS HMIs (Human-Machine Interfaces) are increasingly exposed to the public internet, often without adequate security measures in place. This presents a tantalizing opportunity for malicious actors to exploit vulnerabilities and gain unauthorized access to critical infrastructure.
Using Moondream with batch processing (Csplk/moondream2-batch-processing), I've been analyzing screenshots of public-facing ICS (Csplk/ICS_UIs) HMIs (Csplk/HMI) collected from Shodan to identify which types of ICS HMIs are exposed, how they are operated, and how malicious actors with access to these systems could damage physical infrastructure. Feeding HMI images and pre-defined text prompts to Moondream batch processing extracted (with accuracy not yet confirmed) information about the underlying systems, including
Next steps:
* I have a longer, more in-depth blog write-up in the works covering the approaches from the previous post and this one, to be shared soon as an HF community blog post.
* I plan to keep refining my Moondream-based tool to improve its accuracy and effectiveness in processing public-facing ICS HMIs.
* As mentioned before, an offensive-security-focused Moondream HF Space, once it's fleshed out.
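
For readers wondering what "images plus pre-defined prompts" looks like in practice, here is a minimal sketch of the batch loop only. It is not the code behind Csplk/moondream2-batch-processing: the prompts, the `run_batch` helper, and the `ask` callable are all hypothetical, and `ask` stands in for whatever vision-language model call you actually use.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable, List

# Hypothetical prompts for illustration; the original post does not list its prompts.
PROMPTS = [
    "What kind of system does this interface appear to control?",
    "List the controls and readings visible on this screen.",
]


def run_batch(
    image_paths: Iterable[str],
    prompts: Iterable[str],
    ask: Callable[[Path, str], str],
) -> List[Dict[str, str]]:
    """Ask every prompt about every image and collect one row per pair.

    `ask` is an assumed placeholder: it receives an image path and a prompt
    and returns the model's text answer.
    """
    prompt_list = list(prompts)  # allow generators to be reused per image
    rows: List[Dict[str, str]] = []
    for path in image_paths:
        for prompt in prompt_list:
            answer = ask(Path(path), prompt)
            rows.append({"image": str(path), "prompt": prompt, "answer": answer})
    return rows


if __name__ == "__main__":
    # Stub "model" so the sketch runs without downloading anything.
    def fake_ask(path: Path, prompt: str) -> str:
        return f"(answer about {path.name})"

    for row in run_batch(["hmi_001.png", "hmi_002.png"], PROMPTS, fake_ask):
        print(row)
```

Keeping the model call behind a plain callable makes it easy to swap models or to spot-check answers by hand, which matters here since accuracy is still unconfirmed.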
This is great news for all users with GTX 10XX, P40...
A Flash Attention implementation for older NVIDIA GPUs that does not require Tensor Cores has landed in llama.cpp in the last few days and should be merged into the next version of KoboldCpp. You can already try it with another fork or by building it yourself.
You should expect less VRAM usage for the same context, allowing you to run larger context sizes on your current GPU.
There have also been reports of improved final tokens/second inference speeds, so that's also grand!
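
As a rough illustration of why context size matters for VRAM: a straightforward attention implementation materializes a score matrix of n_heads × n_queries × n_keys, which fused Flash Attention kernels avoid storing in full. The sketch below only does that arithmetic. The model dimensions are hypothetical, and it does not describe how llama.cpp or KoboldCpp actually allocate their buffers.

```python
def naive_scores_bytes(
    n_heads: int,
    n_queries: int,
    n_keys: int,
    bytes_per_element: int = 4,
) -> int:
    """Size of a fully materialized attention score matrix for one layer."""
    return n_heads * n_queries * n_keys * bytes_per_element


if __name__ == "__main__":
    # Hypothetical model: 32 heads, prompt processed 512 tokens at a time.
    for n_ctx in (2048, 8192, 32768):
        size = naive_scores_bytes(n_heads=32, n_queries=512, n_keys=n_ctx)
        print(f"context {n_ctx:>6}: {size / 2**20:.0f} MiB of scores")
```

The buffer grows linearly with context length for a fixed processing batch, which is one reason longer contexts get expensive without a fused kernel.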
If you have tried it, I'd like to hear about your experiences with --flashattention so far, especially with this implementation on the large number of Pascal (GTX 10XX, P40...) cards.
Discussion linked below, with more links to relevant information: