InternLM Tutorial Notes (1)
Beginner notes for the InternLM (书生大模型) tutorial.
Linux
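SSH Connection
Create a personal development machine on the InternStudio platform and connect to it remotely over SSH from VS Code.
Run nvidia-smi to display a summary of the GPU status.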
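The same summary can also be read from a script. The following is a minimal sketch, assuming nvidia-smi is on the PATH of the development machine; the helper name gpu_summary is hypothetical and the queried fields are just one possible selection.

```python
import shutil
import subprocess
from typing import Optional


def gpu_summary() -> Optional[str]:
    """Return nvidia-smi's CSV summary, or None if it is unavailable."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return None  # nvidia-smi is not installed on this machine
    try:
        result = subprocess.run(
            [exe,
             "--query-gpu=name,memory.used,memory.total,utilization.gpu",
             "--format=csv"],
            capture_output=True, text=True, timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    if result.returncode != 0:
        return None  # e.g. the driver is not loaded
    return result.stdout.strip()


if __name__ == "__main__":
    print(gpu_summary() or "nvidia-smi is not available")
```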

Port Mapping
Use the ssh command to set up port mapping.
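As a rough illustration of what such a command looks like, the sketch below assembles an ssh local port forward (-L) as an argument list. The host name, SSH port, and forwarded ports are placeholders rather than values from these notes, and build_port_forward_command is a hypothetical helper.

```python
import shlex
from typing import List


def build_port_forward_command(user_host: str, ssh_port: int,
                               local_port: int, remote_port: int) -> List[str]:
    """Build an ssh command that forwards local_port to remote_port on the remote machine."""
    return [
        "ssh",
        "-N",  # do not run a remote command, only forward ports
        "-L", f"{local_port}:127.0.0.1:{remote_port}",  # local forward
        "-p", str(ssh_port),  # SSH port of the development machine
        user_host,
    ]


if __name__ == "__main__":
    # Placeholder host and ports; replace them with the values shown for your machine.
    cmd = build_port_forward_command("root@dev-machine.example.com", 2222, 7860, 7860)
    print(shlex.join(cmd))
```

While the resulting command is running, requests to the chosen local port are relayed to the same port on the remote machine.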
Run a web demo, helloworld.py:
import socket
import re
import gradio as gr

# Get the numeric suffix of the hostname
def get_hostname():
    hostname = socket.gethostname()
    match = re.search(r'-(\d+)$', hostname)
    # Fall back to the full hostname when it has no "-<digits>" suffix
    name = match.group(1) if match else hostname
    return name

# Build the Gradio interface
with gr.Blocks(gr.themes.Soft()) as demo:
    html_code = f"""
    <p align="center">
    <a href="https://intern-ai.org.cn/home">
    <img src="https://intern-ai.org.cn/assets/headerLogo-4ea34f23.svg" alt="Logo" width="20%" style="border-radius: 5px;">
    </a>
    </p>
    <h1 style="text-align: center;">☁️ Welcome {get_hostname()} user, welcome to the ShuSheng LLM Practical Camp Course!</h1>
    <h2 style="text-align: center;">😀 Let’s go on a journey through ShuSheng Island together.</h2>
    <p align="center">
    <a href="https://github.com/InternLM/Tutorial/blob/camp3">
    <img src="https://oss.lingkongstudy.com.cn/blog/202406301604074.jpg" alt="Logo" width="20%" style="border-radius: 5px;">
    </a>
    </p>
    """
    gr.Markdown(html_code)
demo.launch()
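When the demo is started from the Web IDE terminal without port mapping, it cannot be reached from the local machine. Once the port is mapped, opening the link in a browser shows the web UI.
VS Code provides automatic port forwarding, so no manual configuration is needed.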
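To check from the local machine whether the mapping is active, one option is to attempt a TCP connection to the forwarded port. This is a minimal sketch; is_port_open is a hypothetical helper, and it assumes the demo listens on 7860, Gradio's default port.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable
        return False


if __name__ == "__main__":
    # Assumed port: change it if demo.launch() was given a different one.
    print(is_port_open("127.0.0.1", 7860))
```

A result of False before mapping and True afterwards matches the behaviour described above.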

Python
Wordcount
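Implement a wordcount function that counts how many times each word occurs in an English string. It returns a dictionary whose keys are the words and whose values are the corresponding counts.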
Example:
Input:
"""Hello world!
This is an example.
Word count is fun.
Is it fun to count words?
Yes, it is fun!"""
Output:
{'hello': 1, 'world': 1, 'this': 1, 'is': 4, 'an': 1, 'example': 1, 'word': 1,
'count': 2, 'fun': 3, 'it': 2, 'to': 1, 'words': 1, 'yes': 1}
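Implementation. The apostrophe ' is treated as part of it's, so it is not removed and the word is not split into it is. For simplicity, only the punctuation marks that appear in the example input are handled: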
def wordcount(text: str):
    # Strip the punctuation marks that appear in the example input
    punctuation = [",", ".", "!", "?"]
    for p in punctuation:
        text = text.replace(p, "")
    # Count case-insensitively
    text = text.lower()
    words = text.split()
    word_count = {}
    for word in words:
        if word not in word_count:
            word_count[word] = 1
        else:
            word_count[word] += 1
    return word_count

if __name__ == "__main__":
    text = """
    Got this panda plush toy for my daughter's birthday,
    who loves it and takes it everywhere. It's soft and
    super cute, and its face has a friendly look. It's
    a bit small for what I paid though. I think there
    might be other options that are bigger for the
    same price. It arrived a day earlier than expected,
    so I got to play with it myself before I gave it
    to her.
    """
    result = wordcount(text)
    print(result)
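For comparison, the same counting can be written with the standard library's collections.Counter and str.translate. This is an alternative sketch rather than the version used in these notes, and wordcount_counter is a hypothetical name. It keeps the apostrophe but strips every other ASCII punctuation mark, which is broader than the four marks handled above (a hyphenated word such as well-known would become wellknown).

```python
import string
from collections import Counter
from typing import Dict


def wordcount_counter(text: str) -> Dict[str, int]:
    """Count words case-insensitively, ignoring punctuation except the apostrophe."""
    # Keep "'" so that "it's" stays a single word
    to_strip = string.punctuation.replace("'", "")
    cleaned = text.translate(str.maketrans("", "", to_strip)).lower()
    return dict(Counter(cleaned.split()))


if __name__ == "__main__":
    print(wordcount_counter("It's soft. It's cute!"))
```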
Debug
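Set a breakpoint by clicking to the left of the line number on each line you want to inspect. For example, set breakpoints on the two lines words = text.split() and word_count = {}.
While debugging, watch how the values of the variables change.
The dictionary word_count records how many times each word occurs: if the word is already in the dictionary, its count is incremented by 1; otherwise its count is initialized to 1.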
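The state changes seen while stepping through the loop can also be recorded in code. The sketch below is illustrative only: trace_wordcount is a hypothetical helper that returns each word together with its count right after that step, and it folds the if/else into dict.get.

```python
from typing import Dict, List, Tuple


def trace_wordcount(words: List[str]) -> List[Tuple[str, int]]:
    """Return (word, count after this step) for every word, in order."""
    word_count: Dict[str, int] = {}
    steps: List[Tuple[str, int]] = []
    for word in words:
        # dict.get returns 0 for an unseen word, covering both branches of the if/else
        word_count[word] = word_count.get(word, 0) + 1
        steps.append((word, word_count[word]))
    return steps


if __name__ == "__main__":
    for word, count in trace_wordcount("it is fun it is".split()):
        print(f"{word!r} -> {count}")
```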

Debugging result: the program runs to completion and prints the correct word counts.

Git
GitHub notes repository
References
https://github.com/InternLM/Tutorial
Last modified on 2024-07-10