I fine-tuned jphme/em_german_7b_v01 with AutoTrain using the following command:

!autotrain llm \
--train \
--model ${MODEL_NAME} \
--project-name ${PROJECT_NAME} \
--data-path data/ \
--text-column text \
--lr ${LEARNING_RATE} \
--batch-size ${BATCH_SIZE} \
--epochs ${NUM_EPOCHS} \
--block-size ${BLOCK_SIZE} \
--warmup-ratio ${WARMUP_RATIO} \
--lora-r ${LORA_R} \
--lora-alpha ${LORA_ALPHA} \
--lora-dropout ${LORA_DROPOUT} \
--weight-decay ${WEIGHT_DECAY} \
--gradient-accumulation ${GRADIENT_ACCUMULATION} \
$( [[ "$USE_FP16" == "True" ]] && echo "--fp16" ) \
$( [[ "$USE_PEFT" == "True" ]] && echo "--use-peft" ) \
$( [[ "$USE_INT4" == "True" ]] && echo "--use-int4" ) \
$( [[ "$PUSH_TO_HUB" == "True" ]] && echo "--push-to-hub --token ${HF_TOKEN} --repo-id ${REPO_ID}" )
How do I use this model on Windows 10?
I tried
from transformers import AutoModelForCausalLM, AutoTokenizer
model_path = r"path/to/model"
tokenizer = AutoTokenizer.from_pretrained("jphme/em_german_7b_v01")
model = AutoModelForCausalLM.from_pretrained(model_path)
input_text = "some prompt"
input_ids = tokenizer.encode(input_text, return_tensors="pt")
output = model.generate(input_ids)
predicted_text = tokenizer.decode(output[0], skip_special_tokens=False)
print(predicted_text)
All I get is
<s>ขय语선यせ易京개ulขय语선यせ易 Tŋ京藏ulขय语선यせ易 ŋ개京率개된 ขせ京개개된ขせ京造 변 语 待ў변개语待ξ변d语待ث변된</s>
input_text is German. train.csv is a single-column table named text, with one sentence of the book per line.
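Since the training ran with --use-peft and --use-int4, I suspect the project folder only contains a LoRA adapter rather than full model weights, so maybe I am simply not loading it correctly. Is something like the following the right way to load it? (A rough sketch on my side: the path is a placeholder and I am not sure about the dtype choice.)

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Placeholder: the AutoTrain project folder that should contain the adapter files
adapter_path = r"path/to/my_autotrain_llm"

tokenizer = AutoTokenizer.from_pretrained("jphme/em_german_7b_v01")
base_model = AutoModelForCausalLM.from_pretrained(
    "jphme/em_german_7b_v01",
    torch_dtype=torch.float16,  # guess; CPU-only Windows may need the default float32
)
# Apply the trained LoRA adapter on top of the base weights
model = PeftModel.from_pretrained(base_model, adapter_path)
model.eval()

input_ids = tokenizer.encode("some prompt", return_tensors="pt")
output = model.generate(input_ids, max_new_tokens=100)
print(tokenizer.decode(output[0], skip_special_tokens=True))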
For completeness, these are the Colab parameters and environment variables I set before running the training command:

project_name = 'my_autotrain_llm' # @param {type:"string"}
model_name = 'jphme/em_german_7b_v01' # @param {type:"string"}

learning_rate = 2e-4 # @param {type:"number"}
num_epochs = 1 # @param {type:"number"}
batch_size = 1 # @param {type:"slider", min:1, max:32, step:1}
block_size = 1024 # @param {type:"number"}
trainer = "sft" # @param ["default", "sft"] {type:"raw"}
warmup_ratio = 0.1 # @param {type:"number"}
weight_decay = 0.01 # @param {type:"number"}
gradient_accumulation = 4 # @param {type:"number"}
use_fp16 = True # @param ["False", "True"] {type:"raw"}
use_peft = True # @param ["False", "True"] {type:"raw"}
use_int4 = True # @param ["False", "True"] {type:"raw"}
lora_r = 16 # @param {type:"number"}
lora_alpha = 32 # @param {type:"number"}
lora_dropout = 0.05 # @param {type:"number"}

os.environ["PROJECT_NAME"] = project_name
os.environ["MODEL_NAME"] = model_name
os.environ["PUSH_TO_HUB"] = str(push_to_hub)
os.environ["HF_TOKEN"] = hf_token
os.environ["REPO_ID"] = repo_id
os.environ["LEARNING_RATE"] = str(learning_rate)
os.environ["NUM_EPOCHS"] = str(num_epochs)
os.environ["BATCH_SIZE"] = str(batch_size)
os.environ["BLOCK_SIZE"] = str(block_size)
os.environ["WARMUP_RATIO"] = str(warmup_ratio)
os.environ["WEIGHT_DECAY"] = str(weight_decay)
os.environ["GRADIENT_ACCUMULATION"] = str(gradient_accumulation)
os.environ["USE_FP16"] = str(use_fp16)
os.environ["USE_PEFT"] = str(use_peft)
os.environ["USE_INT4"] = str(use_int4)
os.environ["LORA_R"] = str(lora_r)
os.environ["LORA_ALPHA"] = str(lora_alpha)
os.environ["LORA_DROPOUT"] = str(lora_dropout)