About Beam Prediction Task

#1
by Markydh7 - opened

To reproduce the 64-beam prediction task, I built a residual 1D-CNN model, but my test-set accuracy tops out at around 40%. Could I get your downstream model?

Hello, thanks for sharing your experience! You can find the downstream model for beam prediction here: Beam Prediction Downstream Model.

Hi @Markydh7 , I was not able to load the train and test sets (.p files) for the beam prediction task in MATLAB. Could you tell me how you managed to use them?

Owner

Hello @BerkIGuler , thank you for your feedback!

The .p files associated with the challenge problems on the LWM website—LoS/NLoS Classification and Beam Prediction—are stored as Torch tensors. To convert them into MATLAB-compatible .mat files, you can use the p2mat.py script available at:
p2mat.py on Hugging Face.
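If it helps, here is a minimal sketch of what such a conversion looks like (illustrative only, not the exact contents of p2mat.py; it assumes the .p file was written with torch.save, and the file and variable names are placeholders):

import torch
from scipy.io import savemat

def convert_p_to_mat(p_path, mat_path, var_name='data'):
    # Load the Torch tensor stored in the .p file (CPU is enough for conversion)
    tensor = torch.load(p_path, map_location='cpu')
    # Convert to a NumPy array and write a MATLAB-readable .mat file
    array = tensor.detach().numpy() if torch.is_tensor(tensor) else tensor
    savemat(mat_path, {var_name: array})

# Example usage (placeholder file names):
# convert_p_to_mat('beam_pred_train.p', 'beam_pred_train.mat')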

For the downstream tasks in our paper, we used the following Sub-6 GHz channel scenarios:

  • city_18_denver
  • city_15_indianapolis
  • city_19_oklahoma
  • city_12_fortworth
  • city_11_santaclara
  • city_7_sandiego

Their corresponding mmWave scenarios, used for generating the best beam labels, are available here:
mmWave Scenarios on Hugging Face.

Please let us know if you have any questions!

Thank you for sharing the downstream model! I had a quick question—what does sequence_length represent in the model? If we’re only using the CLS token for the prediction task, wouldn’t it make sense to set sequence_length to 1? I tried doing that, but it seems to cause some issues when running the code. I’d really appreciate your insights! 😊

Owner

In the current model framework, the sequence_length is fixed at 129, representing the total number of input patches. Specifically, a channel of size (32, 32) is segmented into 128 patches, with an additional CLS patch prepended at the beginning of the sequence.

If you need only the CLS embedding as output, you should still provide all 129 input patches to the model. During inference, the model maps all 129 patches from the original channel space into the embedding space, producing 129 output patches. To obtain the CLS embedding, simply extract the first patch from the output sequence.

To generate the CLS embedding, set the input_type variable to 'cls_emb' as outlined in the model card instructions. This ensures the model processes the input accordingly and outputs the CLS embedding patch.
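For illustration, extracting the CLS embedding from the output sequence looks like this (the random tensor below just stands in for the actual LWM output; the shapes follow the description above, with a 64-dimensional embedding per patch):

import torch

batch_size, seq_len, emb_dim = 8, 129, 64   # 1 CLS patch + 128 channel patches

# Stand-in for the LWM output for a batch of channels
embeddings = torch.randn(batch_size, seq_len, emb_dim)

cls_emb = embeddings[:, 0, :]        # first patch of the sequence -> CLS embedding
channel_emb = embeddings[:, 1:, :]   # remaining 128 patches -> channel embeddings

print(cls_emb.shape)       # torch.Size([8, 64])
print(channel_emb.shape)   # torch.Size([8, 128, 64])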

Thank you very much for your patient explanation; the confusion was a misunderstanding on my part. I have now replicated the beam prediction task using the downstream model you provided. Thank you again!

Just to clarify, for the Beam Prediction Downstream Model (1DResNet), I assume you used the following settings:

  • LWM embeddings, with size (128x64), where 64 corresponds to the sequence_length parameter of the ResNet model and 128 is the number of input channels
  • Raw channels, with size (2x32x32), where you concatenated the real and imaginary parts side by side to get (64x32), with 64 as the number of input channels and 32 as the sequence length

Also, the paper says you adjusted the model architecture to ensure fairness between raw channels and embeddings. Can you share the exact architecture details for reproducibility?

Hello, bro! I totally get your question. I think the raw channels here are made up of 128 patches, each 16 elements long, which basically means reshaping 2x32x32 into 128x16. For the model trained on the raw channels, I figured it should be of comparable complexity, so I went with a ResNet too, setting input_channels=16 and sequence_length=128. When I ran this downstream model on the raw channels for the beam prediction task, I noticed that its F1 score comes out pretty close to what you get from the channel-embedding output. Here's my test data; I'm not 100% sure it's spot on, though!

Hello @BerkIGuler ,

You can find the detailed script at this link. To directly address your question, here is a breakdown of the input channels and sequence lengths for all input types (CLS embeddings, channel embeddings, raw channels):

mapping = {
    'cls_emb': {'input_channels': 1, 'sequence_length': 64},
    'channel_emb': {'input_channels': 64, 'sequence_length': 128},
    'raw': {'input_channels': 16, 'sequence_length': 128}
}

As @Markydh7 mentioned, the raw channels are reshaped into a (128, 16) matrix, which follows a similar format to the channel embeddings (128, 64). In this setup, the sequence length refers to the first dimension (number of patches), and the input channels refer to the second dimension (patch size).
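For illustration, the reshape amounts to the following (the exact patch ordering in the released script may differ; this only shows the shape bookkeeping, since 2 x 32 x 32 = 2048 = 128 x 16):

import numpy as np

raw_channel = np.random.randn(2, 32, 32)   # real and imaginary parts of a (32, 32) channel
patches = raw_channel.reshape(128, 16)     # 128 patches, each of length 16

print(patches.shape)   # (128, 16) -> sequence_length=128, input_channels=16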

Performing downstream tasks has been made even easier in LWM 1.1, which will be released next week, along with several other added features and enhancements. Additionally, several videos will be published to address your questions and provide further guidance.

Thanks for your feedback! Please let us know if you need further clarification.

Also, the paper says you adjusted the model architecture to ensure fairness between raw channels and embeddings. Can you share the exact architecture details for reproducibility?

Since CLS embeddings are 32 times smaller than raw channels, using the exact same architecture for both would result in different parameter counts and therefore unequal computational complexity, which would not be a fair comparison. While the core architectures are the same, we added extra fully-connected layers to the model for raw channels to keep the number of parameters consistent.

You do not need to exactly reproduce the downstream models we used in the paper. The essential factor is to ensure fairness in your comparison by maintaining a similar number of parameters, regardless of the model you choose for your downstream task.
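As a rough illustration, a simple parameter-count check is usually enough to keep the comparison fair (the two heads below are placeholders, not the architectures from the paper):

import torch.nn as nn

def count_params(model):
    # Number of trainable parameters
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

# Hypothetical heads: a 64-dim CLS embedding vs. a flattened (2, 32, 32) raw channel,
# both mapped to 64 beam logits; adjust widths/depths until the counts are comparable.
head_for_cls_emb = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 64))
head_for_raw = nn.Sequential(nn.Flatten(), nn.Linear(2048, 16), nn.ReLU(), nn.Linear(16, 64))

print(count_params(head_for_cls_emb))   # ~33k parameters
print(count_params(head_for_raw))       # ~34k parameters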
