license: llama2

Marvinmw/{13b,34b}_reasoner

Welcome to the repository of Marvinmw/{13b,34b}_reasoner, custom 13-billion and 34-billion parameter models built on the Llama 2 architecture and tailored for reasoning and code analysis, especially in the domain of smart contract audits.

Model Description

Marvinmw/{13b,34b}_reasoner is based on Llama 2 and has been fine-tuned on data from Solodit and Code4rena, comprising over 10,000 findings from smart contract audits. This makes it well suited to reasoning over complex smart contract code and security vulnerabilities.
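To make the training data concrete, a single audit finding might look like the record below. This is purely an illustration: the field names and schema are our assumption, since the actual dataset format used for fine-tuning is not published in this README.

```python
# Hypothetical shape of one audit finding used for fine-tuning.
# Field names are illustrative only; the real dataset schema for
# Marvinmw/{13b,34b}_reasoner is not documented here.
finding = {
    "source": "Code4rena",
    "severity": "High",
    "title": "Reentrancy in withdraw()",
    "description": "State is updated after the external call, allowing "
                   "a malicious contract to re-enter and drain funds.",
    "code": "function withdraw(uint256 amount) external { ... }",
}
print(finding["severity"], "-", finding["title"])
```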

Features

  - Fine-tuned on over 10,000 real-world smart contract audit findings from Solodit and Code4rena.
  - Available in 13B and 34B parameter variants built on the Llama 2 architecture.
  - Designed for reasoning over smart contract code and identifying security vulnerabilities.

Getting Started

To use Marvinmw/{13b,34b}_reasoner, follow these steps:

Prerequisites

  - Python 3.8 or later
  - PyTorch and the Hugging Face transformers library (both are used by the code examples below)

Installation

Install the necessary packages using pip:

pip install -r requirements.txt
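If you are loading the model directly from the Hugging Face Hub and do not have the repository's requirements.txt, the usage examples below only need transformers and torch (package names inferred from the code; pin versions as appropriate for your environment):

```shell
pip install transformers torch
```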

Usage

You can load and use the model as follows:

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load with the causal language-modeling head so the model can generate text
# (AutoModel alone would return hidden states, not generated tokens)
model = AutoModelForCausalLM.from_pretrained("MetaTrustSig/13b_reasoner")
tokenizer = AutoTokenizer.from_pretrained("MetaTrustSig/13b_reasoner")

# Example usage
text = "Insert your smart contract code or query here"
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Alternatively, load the model with explicit control over special tokens, device placement, and generation settings:

from transformers import LlamaForCausalLM, LlamaTokenizer
import torch

# Path to your model directory
model_path = "MetaTrustSig/13b_reasoner"


# Load the tokenizer
tokenizer = LlamaTokenizer.from_pretrained(model_path)

# Add special tokens if they are missing
num_added = 0
if tokenizer.eos_token is None:
    num_added = tokenizer.add_special_tokens({
        'eos_token': '</s>',
        'bos_token': '<s>',
        'unk_token': '<unk>',
        'pad_token': '<pad>'
    })

# Load the model with the language modeling head
model = LlamaForCausalLM.from_pretrained(model_path)

# Resize the embedding matrix only if new tokens were actually added
if num_added > 0:
    model.resize_token_embeddings(len(tokenizer))

# Move model to GPU if available
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model.to(device)

text = "YOUR INPUT"

# Tokenize input and move tensors to the appropriate device
inputs = tokenizer(text, return_tensors="pt").to(device)

# Generate text (tune these generation settings as needed)
generated_outputs = model.generate(
    input_ids=inputs['input_ids'],
    attention_mask=inputs['attention_mask'],
    max_new_tokens=1024,
    do_sample=True,
    temperature=0.2,
    top_p=0.9,
    repetition_penalty=1.1,
    eos_token_id=tokenizer.eos_token_id,
    pad_token_id=tokenizer.pad_token_id
    if tokenizer.pad_token_id is not None
    else tokenizer.eos_token_id,
)

# Decode the output
generated_text = tokenizer.decode(generated_outputs[0], skip_special_tokens=True)

print(generated_text)
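Since the model consumes plain text, prompts for audit-style questions can be assembled from the contract source plus an instruction. The helper below is a hypothetical sketch; the prompt template is our assumption, not a documented format for this model:

```python
def build_audit_prompt(contract_source: str, question: str) -> str:
    """Assemble a plain-text prompt pairing contract source with an audit question.

    The template is illustrative; adapt it to whatever format works
    best with the model in practice.
    """
    return (
        "You are a smart contract auditor.\n\n"
        "Contract:\n"
        f"{contract_source.strip()}\n\n"
        f"Question: {question.strip()}\n"
        "Answer:"
    )

contract = """
pragma solidity ^0.8.0;
contract Vault {
    mapping(address => uint256) public balances;
    function withdraw(uint256 amount) external {
        (bool ok, ) = msg.sender.call{value: amount}("");
        require(ok, "transfer failed");
        balances[msg.sender] -= amount;
    }
}
"""

prompt = build_audit_prompt(contract, "Is this withdraw function vulnerable to reentrancy?")
print(prompt)
```

The resulting string can then be passed as the `text` input in the usage examples above.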

Contributing

Contributions to Marvinmw/{13b,34b}_reasoner are welcome! Here's how you can contribute:

  1. Issues: For bugs or feature requests, open an issue.
  2. Pull Requests: Submit a pull request for code changes or documentation updates.

Please see CONTRIBUTING.md for more details on our code of conduct and the process for submitting pull requests to us.

License

This model is distributed under the Llama 2 Community License (as indicated by the `license: llama2` metadata above) - see the LICENSE file for details.

Contact

For any further questions or partnership inquiries, please contact us via email at info@metatrust.io.

We hope you find Marvinmw/{13b,34b}_reasoner useful for your smart contract security needs. Enjoy using it!