RepoFusion: Training Code Models to Understand Your Repository

Disha Shrivastava, Denis Kocetkov, Harm de Vries, Dzmitry Bahdanau, Torsten Scholak

This space contains the released resources for our paper RepoFusion: Training Code Models to Understand Your Repository. A block diagram of our approach can be found below. For more details, refer to the paper.

[Figure: block diagram of the RepoFusion approach]

Data

Stack-Repo can be accessed via the Datasets section of this space. Please see the README for complete details.

Trained Checkpoints

The trained checkpoints can be downloaded from the Models section of this space. Please see the README for complete details.

Code

The code for training and evaluating RepoFusion and for finetuning CodeT5, along with details of how to run the scripts, can be found here

Citation

@article{shrivastava2023repofusion,
  title={RepoFusion: Training Code Models to Understand Your Repository},
  author={Shrivastava, Disha and Kocetkov, Denis and de Vries, Harm and Bahdanau, Dzmitry and Scholak, Torsten},
  journal={arXiv preprint arXiv:2306.10998},
  year={2023}
}