About Beam Prediction Task

#1
by Markydh7 - opened

To reproduce the 64-beam prediction task, I built a residual 1D-CNN model, but my test-set accuracy tops out at around 40%. Could I get your downstream model?

Hello, thanks for sharing your experience! You can find the downstream model for beam prediction here: Beam Prediction Downstream Model.

Hi @Markydh7 , I was not able to load the train and test sets (.p files) for the beam prediction task in MATLAB. Could you tell me how you managed to use them?

Owner

Hello @BerkIGuler , thank you for your feedback!

The .p files associated with the challenge problems on the LWM website—LoS/NLoS Classification and Beam Prediction—are stored as Torch tensors. To convert them into MATLAB-compatible .mat files, you can use the p2mat.py script available at:
p2mat.py on Hugging Face.
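If it helps, here is a minimal sketch of what such a conversion looks like (illustrative only, not the exact contents of p2mat.py; it assumes the .p file was written with torch.save, and the file and variable names are placeholders):

import torch
from scipy.io import savemat

def convert_p_to_mat(p_path, mat_path, var_name='data'):
    # Load the Torch tensor stored in the .p file (CPU is enough for conversion)
    tensor = torch.load(p_path, map_location='cpu')
    # Convert to a NumPy array and write a MATLAB-readable .mat file
    array = tensor.detach().numpy() if torch.is_tensor(tensor) else tensor
    savemat(mat_path, {var_name: array})

# Example usage (placeholder file names):
# convert_p_to_mat('beam_pred_train.p', 'beam_pred_train.mat')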

For the downstream tasks in our paper, we used the following Sub-6 GHz channel scenarios:

  • city_18_denver
  • city_15_indianapolis
  • city_19_oklahoma
  • city_12_fortworth
  • city_11_santaclara
  • city_7_sandiego

Their corresponding mmWave scenarios, used for generating the best beam labels, are available here:
mmWave Scenarios on Hugging Face.

Please let us know if you have any questions!

Thank you for sharing the downstream model! I had a quick question—what does sequence_length represent in the model? If we’re only using the CLS token for the prediction task, wouldn’t it make sense to set sequence_length to 1? I tried doing that, but it seems to cause some issues when running the code. I’d really appreciate your insights! 😊

Owner

In the current model framework, the sequence_length is fixed at 129, representing the total number of input patches. Specifically, a channel of size (32, 32) is segmented into 128 patches, with an additional CLS patch prepended at the beginning of the sequence.

If you need only the CLS embedding as output, you should still provide all 129 input patches to the model. During inference, the model maps all 129 patches from the original channel space into the embedding space, producing 129 output patches. To obtain the CLS embedding, simply extract the first patch from the output sequence.

To generate the CLS embedding, set the input_type variable to 'cls_emb' as outlined in the model card instructions. This ensures the model processes the input accordingly and outputs the CLS embedding patch.
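For illustration, extracting the CLS embedding from the output sequence looks like this (the random tensor below just stands in for the actual LWM output; the shapes follow the description above, with a 64-dimensional embedding per patch):

import torch

batch_size, seq_len, emb_dim = 8, 129, 64   # 1 CLS patch + 128 channel patches

# Stand-in for the LWM output for a batch of channels
embeddings = torch.randn(batch_size, seq_len, emb_dim)

cls_emb = embeddings[:, 0, :]        # first patch of the sequence -> CLS embedding
channel_emb = embeddings[:, 1:, :]   # remaining 128 patches -> channel embeddings

print(cls_emb.shape)       # torch.Size([8, 64])
print(channel_emb.shape)   # torch.Size([8, 128, 64])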

Thank you very much for your patient explanation; the confusion was a misunderstanding on my part. I have now replicated the beam prediction task using the downstream model you provided. Thank you again!

Just to clarify, for the Beam Prediction Downstream Model (1DResNet), I assume you used the following settings:

  • LWM embeddings, with size (128x64), where 64 corresponds to the sequence_length parameter of the ResNet model and 128 is the number of input channels
  • Raw channels, with size (2x32x32), where you concatenated the real and imaginary parts side by side to get (64x32), with 64 as the number of input channels and 32 as the sequence length

Also, the paper says you adjusted the model architecture to ensure fairness between raw channels and embeddings. Can you share the exact architecture details for reproducibility?

Hello, bro! I totally get your question. I think the raw channels here are made up of 128 patches, each 16 elements long, which basically means reshaping 2x32x32 into 128x16. For the model trained on the raw channels, I figured it should be of comparable complexity, so I went with a ResNet too, setting input_channels=16 and sequence_length=128. When I ran this downstream model on the raw channels for the beam prediction task, I noticed that its F1 score comes out pretty close to what you get from the channel-embedding output. Here's my test data; I'm not 100% sure it's spot on, though!

Hello @BerkIGuler ,

You can find the detailed script at this link. To directly address your question, here is a breakdown of the input channels and sequence lengths for all input types (CLS embeddings, channel embeddings, raw channels):

mapping = {
    'cls_emb': {'input_channels': 1, 'sequence_length': 64},
    'channel_emb': {'input_channels': 64, 'sequence_length': 128},
    'raw': {'input_channels': 16, 'sequence_length': 128}
}

As @Markydh7 mentioned, the raw channels are reshaped into a (128, 16) matrix, which follows a similar format to the channel embeddings (128, 64). In this setup, the sequence length refers to the first dimension (number of patches), and the input channels refer to the second dimension (patch size).
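For illustration, the reshape amounts to the following (the exact patch ordering in the released script may differ; this only shows the shape bookkeeping, since 2 x 32 x 32 = 2048 = 128 x 16):

import numpy as np

raw_channel = np.random.randn(2, 32, 32)   # real and imaginary parts of a (32, 32) channel
patches = raw_channel.reshape(128, 16)     # 128 patches, each of length 16

print(patches.shape)   # (128, 16) -> sequence_length=128, input_channels=16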

Performing downstream tasks has been made even easier in LWM 1.1, which will be released next week, along with several other added features and enhancements. Additionally, several videos will be published to address your questions and provide further guidance.

Thanks for your feedback! Please let us know if you need further clarification.

Also, the paper says you adjusted the model architecture to ensure fairness between raw channels and embeddings. Can you share the exact architecture details for reproducibility?

Since CLS embeddings are 32 times smaller than raw channels, using the exact same architecture for both would result in different parameter counts and therefore unequal computational complexity, which would not be a fair comparison. While the core architectures are the same, we added extra fully-connected layers to the model for raw channels to keep the number of parameters consistent.

You do not need to exactly reproduce the downstream models we used in the paper. The essential factor is to ensure fairness in your comparison by maintaining a similar number of parameters, regardless of the model you choose for your downstream task.
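As a rough illustration, a simple parameter-count check is usually enough to keep the comparison fair (the two heads below are placeholders, not the architectures from the paper):

import torch.nn as nn

def count_params(model):
    # Number of trainable parameters
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

# Hypothetical heads: a 64-dim CLS embedding vs. a flattened (2, 32, 32) raw channel,
# both mapped to 64 beam logits; adjust widths/depths until the counts are comparable.
head_for_cls_emb = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 64))
head_for_raw = nn.Sequential(nn.Flatten(), nn.Linear(2048, 16), nn.ReLU(), nn.Linear(16, 64))

print(count_params(head_for_cls_emb))   # ~33k parameters
print(count_params(head_for_raw))       # ~34k parameters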
