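Source Code Analysis of the Initial Version of Git
I. Background
I wanted a better understanding of how git works under the hood, and also a chance to study Linus's coding approach and style.
II. Choosing a version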
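Git project repository: https://github.com/git/git.git
As an open-source project gains features, its code base grows and the core ideas get buried under more and more branch details, so I went straight to the very first version of git. Let's go back to the afternoon of April 7, 2005.
    commit e83c5163316f89bfbde7d9ab23ca2e25604af290 (HEAD)
    Author: Linus Torvalds <torvalds@ppc970.osdl.org>
    Date:   Thu Apr 7 15:13:13 2005 -0700
Inside the directory there are only a handful of files, just over 1,000 lines of code in total.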
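III. Running the code
Reviewing and debugging is a very effective way to learn a code base, so the first step is to get the code running. Compiling on a Mac produced some warnings and errors; after a few local changes it built cleanly. If you want to try it yourself, the changes are:
1. Install the openssl and zlib libraries
    brew install openssl
    brew install zlib
2. Change the compile and link options: specify the header and library locations, and turn off deprecated-function warnings
    CFLAGS=-g -Wno-deprecated -I/usr/local/opt/openssl/include
    LDFLAGS=-L/usr/local/opt/openssl/lib -L/usr/local/opt/zlib/lib
3. Change the linked libraries from -lssl to -lcrypto -lz
4. Add return values to main, and update the time-related struct fields
5. I could not find a usable gdb for M1 Macs; lldb works as a replacement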
The specific changes are:
    [graypig:] git diff
    diff --git a/Makefile b/Makefile
    index a6bba79ba1..fe779bdb75 100644
    --- a/Makefile
    +++ b/Makefile
    @@ -1,4 +1,5 @@
    -CFLAGS=-g
    +CFLAGS=-g -Wno-deprecated -I/usr/local/opt/openssl/include
    +LDFLAGS=-L/usr/local/opt/openssl/lib -L/usr/local/opt/zlib/lib
     CC=gcc

     PROG=update-cache show-diff init-db write-tree read-tree commit-tree cat-file
    @@ -8,27 +9,27 @@ all: $(PROG)
     install: $(PROG)
         install $(PROG) $(HOME)/bin/

    -LIBS= -lssl
    +LIBS= -lcrypto -lz

     init-db: init-db.o

     update-cache: update-cache.o read-cache.o
    -    $(CC) $(CFLAGS) -o update-cache update-cache.o read-cache.o $(LIBS)
    +    $(CC) $(CFLAGS) $(LDFLAGS) -o update-cache update-cache.o read-cache.o $(LIBS)

     show-diff: show-diff.o read-cache.o
    -    $(CC) $(CFLAGS) -o show-diff show-diff.o read-cache.o $(LIBS)
    +    $(CC) $(CFLAGS) $(LDFLAGS) -o show-diff show-diff.o read-cache.o $(LIBS)

     write-tree: write-tree.o read-cache.o
    -    $(CC) $(CFLAGS) -o write-tree write-tree.o read-cache.o $(LIBS)
    +    $(CC) $(CFLAGS) $(LDFLAGS) -o write-tree write-tree.o read-cache.o $(LIBS)

     read-tree: read-tree.o read-cache.o
    -    $(CC) $(CFLAGS) -o read-tree read-tree.o read-cache.o $(LIBS)
    +    $(CC) $(CFLAGS) $(LDFLAGS) -o read-tree read-tree.o read-cache.o $(LIBS)

     commit-tree: commit-tree.o read-cache.o
    -    $(CC) $(CFLAGS) -o commit-tree commit-tree.o read-cache.o $(LIBS)
    +    $(CC) $(CFLAGS) $(LDFLAGS) -o commit-tree commit-tree.o read-cache.o $(LIBS)

     cat-file: cat-file.o read-cache.o
    -    $(CC) $(CFLAGS) -o cat-file cat-file.o read-cache.o $(LIBS)
    +    $(CC) $(CFLAGS) $(LDFLAGS) -o cat-file cat-file.o read-cache.o $(LIBS)

     read-cache.o: cache.h
     show-diff.o: cache.h
    diff --git a/cache.h b/cache.h
    index 98a32a9ad3..161a5aff90 100644
    --- a/cache.h
    +++ b/cache.h
    @@ -12,6 +12,7 @@

     #include <openssl/sha.h>
     #include <zlib.h>
    +#include <string.h>

     /*
      * Basic data structures for the directory cache
    diff --git a/init-db.c b/init-db.c
    index 25dc13fe10..d11b16bff5 100644
    --- a/init-db.c
    +++ b/init-db.c
    @@ -19,8 +19,8 @@ int main(int argc, char **argv)
         sha1_dir = getenv(DB_ENVIRONMENT);
         if (sha1_dir) {
             struct stat st;
    -        if (!stat(sha1_dir, &st) < 0 && S_ISDIR(st.st_mode))
    -            return;
    +        if (!(stat(sha1_dir, &st) < 0) && S_ISDIR(st.st_mode))
    +            return 0;
             fprintf(stderr, "DB_ENVIRONMENT set to bad directory %s: ", sha1_dir);
         }

    diff --git a/show-diff.c b/show-diff.c
    index b8522886a1..6d00ba2a6f 100644
    --- a/show-diff.c
    +++ b/show-diff.c
    @@ -11,11 +11,11 @@ static int match_stat(struct cache_entry *ce, struct stat *st)
     {
         unsigned int changed = 0;

    -    if (ce->mtime.sec  != (unsigned int)st->st_mtim.tv_sec ||
    -        ce->mtime.nsec != (unsigned int)st->st_mtim.tv_nsec)
    +    if (ce->mtime.sec  != (unsigned int)st->st_mtimespec.tv_sec ||
    +        ce->mtime.nsec != (unsigned int)st->st_mtimespec.tv_nsec)
             changed |= MTIME_CHANGED;
    -    if (ce->ctime.sec  != (unsigned int)st->st_ctim.tv_sec ||
    -        ce->ctime.nsec != (unsigned int)st->st_ctim.tv_nsec)
    +    if (ce->ctime.sec  != (unsigned int)st->st_ctimespec.tv_sec ||
    +        ce->ctime.nsec != (unsigned int)st->st_ctimespec.tv_nsec)
             changed |= CTIME_CHANGED;
         if (ce->st_uid != (unsigned int)st->st_uid ||
             ce->st_gid != (unsigned int)st->st_gid)
    diff --git a/update-cache.c b/update-cache.c
    index 5085a5cb53..f9c8e0fc69 100644
    --- a/update-cache.c
    +++ b/update-cache.c
    @@ -139,9 +139,9 @@ static int add_file_to_cache(char *path)
         memset(ce, 0, size);
         memcpy(ce->name, path, namelen);
         ce->ctime.sec = st.st_ctime;
    -    ce->ctime.nsec = st.st_ctim.tv_nsec;
    +    ce->ctime.nsec = st.st_ctimespec.tv_nsec;
         ce->mtime.sec = st.st_mtime;
    -    ce->mtime.nsec = st.st_mtim.tv_nsec;
    +    ce->mtime.nsec = st.st_mtimespec.tv_nsec;
         ce->st_dev = st.st_dev;
         ce->st_ino = st.st_ino;
         ce->st_mode = st.st_mode;
IV. Source code analysis
1. init-db.c
Core logic: create the cache directory .dircache/objects and pre-create 256 subdirectories under it, named as follows:
    .dircache/objects/00
    .dircache/objects/01
    .dircache/objects/...
    .dircache/objects/ff

    #include "cache.h"

    int main(int argc, char **argv)
    {
        char *sha1_dir = getenv(DB_ENVIRONMENT), *path;
        int len, i, fd;

        if (mkdir(".dircache", 0700) < 0) {
            perror("unable to create .dircache");
            exit(1);
        }

        /*
         * If you want to, you can share the DB area with any number of branches.
         * That has advantages: you can save space by sharing all the SHA1 objects.
         * On the other hand, it might just make lookup slower and messier. You
         * be the judge.
         */
        sha1_dir = getenv(DB_ENVIRONMENT);
        if (sha1_dir) {
            struct stat st;
            if (!stat(sha1_dir, &st) < 0 && S_ISDIR(st.st_mode))
                return;
            fprintf(stderr, "DB_ENVIRONMENT set to bad directory %s: ", sha1_dir);
        }

        /*
         * The default case is to have a DB per managed directory.
         */
        sha1_dir = DEFAULT_DB_ENVIRONMENT;
        fprintf(stderr, "defaulting to private storage area\n");
        len = strlen(sha1_dir);
        if (mkdir(sha1_dir, 0700) < 0) {
            if (errno != EEXIST) {
                perror(sha1_dir);
                exit(1);
            }
        }
        /* Note: malloc does not zero the memory, but sprintf appends a
         * terminating '\0', so the path is always properly terminated. */
        path = malloc(len + 40);
        memcpy(path, sha1_dir, len);
        for (i = 0; i < 256; i++) {
            /* print as two hexadecimal characters */
            sprintf(path+len, "/%02x", i);
            if (mkdir(path, 0700) < 0) {
                if (errno != EEXIST) {
                    perror(path);
                    exit(1);
                }
            }
        }
        return 0;
    }
2. update-cache.c
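The cache entry layout is carefully designed: the in-memory cache entry structures can be rebuilt directly from the file's byte stream, which avoids a copy. Core logic:
First read the contents of .dircache/index. After validating each file to be added, compress it with zlib and compute its sha1, derive the object path .dircache/objects/xx/xx{19} from the sha1 (the first hash byte names the directory, the remaining 19 bytes name the file), and save the object. Then update the global cache and write it back to disk as a new .dircache/index.
Object file format for file contents: the header "blob <size>\0" followed by the file content, all zlib-compressed.
    int main(int argc, char **argv)
    {
        int i, newfd, entries;

        entries = read_cache();
        if (entries < 0) {
            perror("cache corrupted");
            return -1;
        }

        newfd = open(".dircache/index.lock", O_RDWR | O_CREAT | O_EXCL, 0600);
        if (newfd < 0) {
            perror("unable to create new cachefile");
            return -1;
        }
        for (i = 1 ; i < argc; i++) {
            char *path = argv[i];
            /* check that the path is valid; reject ".", ".." and a trailing "/" */
            if (!verify_path(path)) {
                fprintf(stderr, "Ignoring path %s\n", argv[i]);
                continue;
            }
            if (add_file_to_cache(path)) {
                fprintf(stderr, "Unable to add %s to database\n", path);
                goto out;
            }
        }
        if (!write_cache(newfd, active_cache, active_nr) && !rename(".dircache/index.lock", ".dircache/index"))
            return 0;
    out:
        unlink(".dircache/index.lock");
    }
2.1 Cache reading logic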
read_cache reads the cache: open the cache file .dircache/index, map it into memory with mmap, verify the file's sha1, and rebuild the cache data from the entry count in the header.
    int read_cache(void)
    {
        int fd, i;
        struct stat st;
        unsigned long size, offset;
        void *map;
        struct cache_header *hdr;

        errno = EBUSY;
        if (active_cache)
            return error("more than one cachefile");
        errno = ENOENT;
        sha1_file_directory = getenv(DB_ENVIRONMENT);
        if (!sha1_file_directory)
            sha1_file_directory = DEFAULT_DB_ENVIRONMENT;
        if (access(sha1_file_directory, X_OK) < 0)
            return error("no access to SHA1 file directory");
        fd = open(".dircache/index", O_RDONLY);
        if (fd < 0)
            return (errno == ENOENT) ? 0 : error("open failed");

        map = (void *)-1;
        if (!fstat(fd, &st)) {
            map = NULL;
            size = st.st_size;
            errno = EINVAL;
            if (size > sizeof(struct cache_header))
                map = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
        }
        close(fd);
        if (-1 == (int)(long)map)
            return error("mmap failed");

        hdr = map;
        if (verify_hdr(hdr, size) < 0)
            goto unmap;

        /* allocate according to the entry count, reserving about 1.5x the space */
        active_nr = hdr->entries;
        active_alloc = alloc_nr(active_nr);
        active_cache = calloc(active_alloc, sizeof(struct cache_entry *));

        /* rebuild the in-memory structures directly from the file bytes */
        offset = sizeof(*hdr);
        for (i = 0; i < hdr->entries; i++) {
            struct cache_entry *ce = map + offset;
            offset = offset + ce_size(ce);
            active_cache[i] = ce;
        }
        return active_nr;

    unmap:
        munmap(map, size);
        errno = EINVAL;
        return error("verify header failed");
    }
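The "rebuild without copying" idea in read_cache can be tried in isolation. The following is a minimal sketch, not the git code: `struct demo_entry`, `demo_put`, and `demo_index` are hypothetical names, the record has only two fixed fields, and an ordinary buffer stands in for the mmap'ed index file. It assumes the buffer is suitably aligned and that the caller guarantees enough space.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record: fixed fields followed by an inline name. */
struct demo_entry {
    unsigned int   size;      /* payload size, for illustration only */
    unsigned short namelen;   /* length of name, excluding the NUL */
    char           name[];    /* flexible array member, stored inline */
};

/* Same rounding rule as cache_entry_size(): pad to a multiple of 8. */
static size_t demo_entry_size(size_t namelen)
{
    return (offsetof(struct demo_entry, name) + namelen + 8) & ~(size_t)7;
}

/* Append one record at buf + off (caller guarantees space); returns the new offset. */
static size_t demo_put(unsigned char *buf, size_t off, const char *name, unsigned int size)
{
    struct demo_entry *e = (struct demo_entry *)(buf + off);
    size_t namelen = strlen(name);

    assert(namelen <= 0xffff);
    memset(e, 0, demo_entry_size(namelen));   /* zero padding doubles as the NUL */
    e->size = size;
    e->namelen = (unsigned short)namelen;
    memcpy(e->name, name, namelen);
    return off + demo_entry_size(namelen);
}

/* Fill table[] with pointers into buf, copying nothing; returns the count. */
static int demo_index(unsigned char *buf, size_t len, struct demo_entry **table, int max)
{
    size_t off = 0;
    int n = 0;

    while (off < len && n < max) {
        struct demo_entry *e = (struct demo_entry *)(buf + off);
        table[n++] = e;
        off += demo_entry_size(e->namelen);
    }
    return n;
}
```

Because every record size is a multiple of 8, each record starts at an aligned offset, which is what makes pointing a struct directly at the mapped bytes workable.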
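verify_hdr validates the cache header: recompute the sha1 over the cache and compare it with the sha1 stored in the header.
    static int verify_hdr(struct cache_header *hdr, unsigned long size)
    {
        SHA_CTX c;
        unsigned char sha1[20];

        /* basic checks: signature and version */
        if (hdr->signature != CACHE_SIGNATURE)
            return error("bad signature");
        if (hdr->version != 1)
            return error("bad version");
        SHA1_Init(&c);
        /* hash the header fields, excluding the sha1 field itself */
        SHA1_Update(&c, hdr, offsetof(struct cache_header, sha1));
        /* hash the cache contents; hdr+1 skips over the header */
        SHA1_Update(&c, hdr+1, size - sizeof(*hdr));
        /* compute the sha1 */
        SHA1_Final(sha1, &c);
        /* compare the sha1 */
        if (memcmp(sha1, hdr->sha1, 20))
            return error("bad header sha1");
        return 0;
    }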
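The pattern used by verify_hdr, a digest stored inside the header that covers the header fields before it plus everything after the header, can be sketched without OpenSSL. This is an illustration under stated assumptions: FNV-1a is swapped in for SHA1 purely to keep the example dependency-free, and `struct demo_header`, `demo_digest`, and `demo_verify` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_header {
    uint32_t signature;
    uint32_t version;
    uint32_t entries;
    uint32_t digest;     /* stand-in for the 20-byte sha1 field */
};

/* FNV-1a, used here only as a simple stand-in for SHA1. */
static uint32_t fnv1a(uint32_t h, const void *data, size_t len)
{
    const unsigned char *p = data;

    while (len--) {
        h ^= *p++;
        h *= 16777619u;
    }
    return h;
}

/* Hash the header fields before `digest`, then everything after the header. */
static uint32_t demo_digest(const struct demo_header *hdr, size_t size)
{
    uint32_t h = 2166136261u;

    h = fnv1a(h, hdr, offsetof(struct demo_header, digest));
    h = fnv1a(h, hdr + 1, size - sizeof(*hdr));
    return h;
}

/* Returns 0 when the stored digest matches, -1 otherwise. */
static int demo_verify(const struct demo_header *hdr, size_t size)
{
    assert(size >= sizeof(*hdr));
    return demo_digest(hdr, size) == hdr->digest ? 0 : -1;
}
```

Skipping the digest field itself is what lets the writer fill it in last: the value being stored never feeds into its own computation.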
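Notes on the special macros:
    struct cache_entry {
        struct cache_time ctime;
        struct cache_time mtime;
        unsigned int st_dev;
        unsigned int st_ino;
        unsigned int st_mode;
        unsigned int st_uid;
        unsigned int st_gid;
        unsigned int st_size;
        unsigned char sha1[20];
        unsigned short namelen;
        /* zero-length array; takes no space in the struct */
        unsigned char name[0];
    };

    /* compute the length of a cache entry */
    #define ce_size(ce) cache_entry_size((ce)->namelen)

    /*
     * offsetof(struct cache_entry, name): offset of name within the struct,
     *     i.e. the size of the entry excluding name
     * & ~7: clears the lowest 3 bits, i.e. aligns the final length to 8
     * + 8: clearing the lowest 3 bits rounds down, so 8 is added first
     *     to reserve the space
     */
    #define cache_entry_size(len) ((offsetof(struct cache_entry,name) + (len) + 8) & ~7)
2.2 Adding a file to the cache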
Get the file's metadata, index the file, and add it to the cache as an entry.
    static int add_file_to_cache(char *path)
    {
        int size, namelen;
        struct cache_entry *ce;
        struct stat st;
        int fd;

        fd = open(path, O_RDONLY);
        if (fd < 0) {
            if (errno == ENOENT)
                return remove_file_from_cache(path);
            return -1;
        }
        if (fstat(fd, &st) < 0) {
            close(fd);
            return -1;
        }
        namelen = strlen(path);
        size = cache_entry_size(namelen);
        ce = malloc(size);
        memset(ce, 0, size);
        memcpy(ce->name, path, namelen);
        ce->ctime.sec = st.st_ctime;
        ce->ctime.nsec = st.st_ctimespec.tv_nsec;
        ce->mtime.sec = st.st_mtime;
        ce->mtime.nsec = st.st_mtimespec.tv_nsec;
        ce->st_dev = st.st_dev;
        ce->st_ino = st.st_ino;
        ce->st_mode = st.st_mode;
        ce->st_uid = st.st_uid;
        ce->st_gid = st.st_gid;
        ce->st_size = st.st_size;
        ce->namelen = namelen;

        if (index_fd(path, namelen, ce, fd, &st) < 0)
            return -1;

        return add_cache_entry(ce);
    }
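add_file_to_cache sizes each entry with cache_entry_size, so the rounding rule is worth checking with concrete numbers. A small sketch follows; `padded_size` and `padding_bytes` are hypothetical helpers, and the value 62 used in the note below assumes 4-byte unsigned int and 2-byte unsigned short, which gives offsetof(struct cache_entry, name) == 62 on typical platforms.

```c
#include <assert.h>
#include <stddef.h>

/* Round (fixed + len) up to a multiple of 8, always leaving at least 1 spare byte. */
static size_t padded_size(size_t fixed, size_t len)
{
    size_t total = (fixed + len + 8) & ~(size_t)7;

    assert(total % 8 == 0 && total > fixed + len);
    return total;
}

/* Spare bytes after the name: always between 1 and 8. */
static size_t padding_bytes(size_t fixed, size_t len)
{
    return padded_size(fixed, len) - (fixed + len);
}
```

With fixed = 62, a 1-byte name gives 64 bytes (1 spare byte), a 2-byte name gives 72 (8 spare bytes), and a 9-byte name also gives 72 (1 spare byte). Since the entry is zeroed with memset first, that guaranteed spare byte is what NUL-terminates the name.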
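File indexing flow: mmap the file into memory, compress the metadata with zlib ("blob" + size + nul byte), compress the file content, compute the sha1 of the compressed output, derive the object file name from the sha1, and write the object file.
    static int index_fd(const char *path, int namelen, struct cache_entry *ce, int fd, struct stat *st)
    {
        z_stream stream;
        int max_out_bytes = namelen + st->st_size + 200;
        void *out = malloc(max_out_bytes);
        void *metadata = malloc(namelen + 200);
        void *in = mmap(NULL, st->st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        SHA_CTX c;

        close(fd);
        if (!out || (int)(long)in == -1)
            return -1;

        memset(&stream, 0, sizeof(stream));
        deflateInit(&stream, Z_BEST_COMPRESSION);

        /*
         * Compress the metadata:
         * ASCII size + nul byte
         */
        stream.next_in = metadata;
        stream.avail_in = 1+sprintf(metadata, "blob %lu", (unsigned long) st->st_size);
        stream.next_out = out;
        stream.avail_out = max_out_bytes;
        while (deflate(&stream, 0) == Z_OK)
            /* nothing */;

        /*
         * File content: compress the file content
         */
        stream.next_in = in;
        stream.avail_in = st->st_size;
        while (deflate(&stream, Z_FINISH) == Z_OK)
            /* nothing */;

        deflateEnd(&stream);

        SHA1_Init(&c);
        SHA1_Update(&c, out, stream.total_out);
        /* compute the sha1 */
        SHA1_Final(ce->sha1, &c);

        /* write the object into the object store */
        return write_sha1_buffer(ce->sha1, out, stream.total_out);
    }

    int write_sha1_buffer(unsigned char *sha1, void *buf, unsigned int size)
    {
        char *filename = sha1_file_name(sha1);
        int i, fd;

        fd = open(filename, O_WRONLY | O_CREAT | O_EXCL, 0666);
        if (fd < 0)
            return (errno == EEXIST) ? 0 : -1;
        write(fd, buf, size);
        close(fd);
        return 0;
    }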
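One detail in index_fd that is easy to miss is `1+sprintf(...)`: the extra 1 makes the terminating NUL part of the compressed input, so it becomes the separator between the header and the content. A minimal sketch of just that step, with a bounds check added, might look like this (`object_header` is a hypothetical helper, not a function in the original source):

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write "<type> <size>" plus its terminating NUL into buf.
 * Returns the header length including the NUL, or 0 if buf is too small.
 */
static size_t object_header(char *buf, size_t buflen, const char *type, unsigned long size)
{
    int n;

    assert(buf && type);
    n = snprintf(buf, buflen, "%s %lu", type, size);
    if (n < 0 || (size_t)n >= buflen)
        return 0;
    return (size_t)n + 1;   /* +1 keeps the NUL as the header/content separator */
}
```

For a 1234-byte file, `object_header(buf, sizeof(buf), "blob", 1234)` yields the 10 bytes "blob 1234" followed by a NUL.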
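Computing the file name from the hash: the first hash byte determines the directory, and the remaining 19 hash bytes determine the file name.
    /*
     * NOTE! This returns a statically allocated buffer, so you have to be
     * careful about using it. Do a "strdup()" if you need to save the
     * filename.
     */
    char *sha1_file_name(unsigned char *sha1)
    {
        int i;
        static char *name, *base;

        if (!base) {
            char *sha1_file_directory = getenv(DB_ENVIRONMENT) ? : DEFAULT_DB_ENVIRONMENT;
            int len = strlen(sha1_file_directory);
            base = malloc(len + 60);
            memcpy(base, sha1_file_directory, len);
            memset(base+len, 0, 60);
            /* .dircache/objects/xx/xx{19} */
            base[len] = '/';
            base[len+3] = '/';
            name = base + len + 1;
        }
        for (i = 0; i < 20; i++) {
            static char hex[] = "0123456789abcdef";
            unsigned int val = sha1[i];
            /*
             * Compute the file name from the hash: the first hash byte
             * determines the directory, the remaining 19 determine the file name.
             * (i > 0) skips over the '/': the first byte goes before it,
             * the remaining 19 bytes go after it.
             */
            char *pos = name + i*2 + (i > 0);
            *pos++ = hex[val >> 4];
            *pos = hex[val & 0xf];
        }
        return base;
    }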
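The path layout can be exercised on its own. The sketch below is a variant, not the original: `object_path` is a hypothetical function that writes into a caller-supplied buffer instead of returning a static one, which sidesteps the "do a strdup()" caveat at the cost of an extra parameter.

```c
#include <assert.h>
#include <string.h>

/*
 * Format dir + "/" + 2 hex digits + "/" + 38 hex digits into out.
 * Returns 0 on success, -1 if out is too small.
 */
static int object_path(char *out, size_t outlen, const char *dir, const unsigned char sha1[20])
{
    static const char hex[] = "0123456789abcdef";
    size_t len, need;
    char *p;
    int i;

    assert(out && dir && sha1);
    len = strlen(dir);
    need = len + 1 + 2 + 1 + 38 + 1;
    if (outlen < need)
        return -1;
    memcpy(out, dir, len);
    p = out + len;
    *p++ = '/';
    for (i = 0; i < 20; i++) {
        if (i == 1)
            *p++ = '/';            /* split after the first hash byte */
        *p++ = hex[sha1[i] >> 4];
        *p++ = hex[sha1[i] & 0xf];
    }
    *p = '\0';
    return 0;
}
```

For a hash whose bytes are 0x00, 0x01, ..., 0x13 and the directory .dircache/objects, the result is .dircache/objects/00/0102030405060708090a0b0c0d0e0f10111213.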
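add_cache_entry adds the file to the cache: the cache is kept sorted by file path and searched by binary search.
    static int add_cache_entry(struct cache_entry *ce)
    {
        int pos;

        pos = cache_name_pos(ce->name, ce->namelen);

        /* existing match? Just replace it */
        if (pos < 0) {
            active_cache[-pos-1] = ce;
            return 0;
        }

        /* Make sure the array is big enough .. */
        if (active_nr == active_alloc) {
            active_alloc = alloc_nr(active_alloc);
            active_cache = realloc(active_cache, active_alloc * sizeof(struct cache_entry *));
        }

        /* Add it in.. */
        active_nr++;
        /* the insert position is not at the end: shift the elements from pos onward back by one */
        if (active_nr > pos)
            memmove(active_cache + pos + 1, active_cache + pos, (active_nr - pos - 1) * sizeof(ce));
        active_cache[pos] = ce;
        return 0;
    }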
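The grow-then-shift insertion in add_cache_entry can be sketched as a standalone helper. This is an illustration rather than the original code: `struct ptr_table` and `table_insert` are hypothetical names, the growth rule copies what alloc_nr() does ((x + 16) * 3 / 2), and unlike the original it checks the realloc result.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct ptr_table {
    const char **items;
    int nr, alloc;
};

/* Insert item at index pos (0 <= pos <= nr), shifting the tail up by one. */
static int table_insert(struct ptr_table *t, int pos, const char *item)
{
    assert(pos >= 0 && pos <= t->nr);
    if (t->nr == t->alloc) {
        int alloc = (t->alloc + 16) * 3 / 2;      /* same growth rule as alloc_nr() */
        const char **grown = realloc(t->items, alloc * sizeof(*grown));
        if (!grown)
            return -1;                            /* the original does not check this */
        t->items = grown;
        t->alloc = alloc;
    }
    memmove(t->items + pos + 1, t->items + pos, (t->nr - pos) * sizeof(*t->items));
    t->items[pos] = item;
    t->nr++;
    return 0;
}
```

Only pointers are moved, never the entries themselves, so an insert costs one memmove over at most nr pointers regardless of how long the file names are.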
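cache_name_pos finds the position of a cache entry by name using binary search. Its return value is unusual: when the name is not found it returns the final value of first (>= 0), and when it is found at index p it returns -p-1 (< 0). This is a performance-minded design: a hit returns the entry's position, and a miss returns the position where the entry should be inserted.
    static int cache_name_pos(const char *name, int namelen)
    {
        int first, last;

        first = 0;
        last = active_nr;
        while (last > first) {
            int next = (last + first) >> 1;
            struct cache_entry *ce = active_cache[next];
            int cmp = cache_name_compare(name, namelen, ce->name, ce->namelen);
            if (!cmp)
                return -next-1;
            if (cmp < 0) {
                last = next;
                continue;
            }
            first = next+1;
        }
        return first;
    }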
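The return-value encoding is easiest to see with a few concrete lookups. Below is a small self-contained sketch over an array of NUL-terminated strings; `name_compare` and `name_pos` are hypothetical stand-ins for the cache functions, and the array is assumed to be sorted in memcmp order.

```c
#include <assert.h>
#include <string.h>

/* Compare two length-delimited names: bytes first, then length. */
static int name_compare(const char *a, int alen, const char *b, int blen)
{
    int len = alen < blen ? alen : blen;
    int cmp = memcmp(a, b, len);

    if (cmp)
        return cmp;
    return (alen > blen) - (alen < blen);
}

/*
 * Search a sorted array of names.
 * Found at index p -> returns -p-1 (always negative).
 * Not found        -> returns the insertion index (>= 0).
 */
static int name_pos(const char **names, int nr, const char *name)
{
    int first = 0, last = nr;
    int namelen = (int)strlen(name);

    assert(nr >= 0);
    while (last > first) {
        int next = first + (last - first) / 2;
        int cmp = name_compare(name, namelen, names[next], (int)strlen(names[next]));
        if (!cmp)
            return -next - 1;
        if (cmp < 0)
            last = next;
        else
            first = next + 1;
    }
    return first;
}
```

A caller decodes the result with a single sign test: if the value is negative, the match is at index -pos-1; otherwise the value is where a new entry belongs. Using -p-1 rather than -p keeps index 0 unambiguous, since -0 would be indistinguishable from "insert at 0".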
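cache_name_compare compares the names first and then the lengths: 0 means equal, a negative value means less than, a positive value means greater than.
    static int cache_name_compare(const char *name1, int len1, const char *name2, int len2)
    {
        int len = len1 < len2 ? len1 : len2;
        int cmp;

        cmp = memcmp(name1, name2, len);
        if (cmp)
            return cmp;
        if (len1 < len2)
            return -1;
        if (len1 > len2)
            return 1;
        return 0;
    }
3. show-diff.c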
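Core logic: read the cache, then for each entry use its metadata to decide whether the file has changed. If it has, print the file path and sha1, locate the object by its sha1, decompress its contents, and run the system diff command (diff -u - {name}) to print the differences. For the diff command, - means standard input.
    int main(int argc, char **argv)
    {
        int entries = read_cache();
        int i;

        if (entries < 0) {
            perror("read_cache");
            exit(1);
        }
        for (i = 0; i < entries; i++) {
            struct stat st;
            struct cache_entry *ce = active_cache[i];
            int n, changed;
            unsigned int mode;
            unsigned long size;
            char type[20];
            void *new;

            if (stat(ce->name, &st) < 0) {
                printf("%s: %s\n", ce->name, strerror(errno));
                continue;
            }
            changed = match_stat(ce, &st);
            if (!changed) {
                printf("%s: ok\n", ce->name);
                continue;
            }
            printf("%.*s:  ", ce->namelen, ce->name);
            for (n = 0; n < 20; n++)
                printf("%02x", ce->sha1[n]);
            printf("\n");
            new = read_sha1_file(ce->sha1, type, &size);
            show_differences(ce, &st, new, size);
            free(new);
        }
        return 0;
    }
show_differences runs the system diff command to print the differences.
    static void show_differences(struct cache_entry *ce, struct stat *cur,
        void *old_contents, unsigned long long old_size)
    {
        static char cmd[1000];
        FILE *f;

        snprintf(cmd, sizeof(cmd), "diff -u - %s", ce->name);
        f = popen(cmd, "w");
        fwrite(old_contents, old_size, 1, f);
        pclose(f);
    }
4. cat-file.c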
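Core logic: derive the object file name from the sha1, read and decompress the file, write the contents to a temporary file, and print the temporary file name and the object type.
    int main(int argc, char **argv)
    {
        unsigned char sha1[20];
        char type[20];
        void *buf;
        unsigned long size;
        char template[] = "temp_git_file_XXXXXX";
        int fd;

        if (argc != 2 || get_sha1_hex(argv[1], sha1))
            usage("cat-file: cat-file <sha1>");
        buf = read_sha1_file(sha1, type, &size);
        if (!buf)
            exit(1);
        fd = mkstemp(template);
        if (fd < 0)
            usage("unable to create tempfile");
        if (write(fd, buf, size) != size)
            strcpy(type, "bad");
        printf("%s: %s\n", template, type);
    }
5. write-tree.c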
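Core logic: read the cached file entries and assemble the tree content. Content format: "tree <size>\0" followed by one "<mode> <name>\0<sha1>" record per entry. Then compute the sha1 of the content, derive the file path from the sha1, and write the compressed data to that file.
    int main(int argc, char **argv)
    {
        unsigned long size, offset, val;
        int i, entries = read_cache();
        char *buffer;

        if (entries <= 0) {
            fprintf(stderr, "No file-cache to create a tree of\n");
            exit(1);
        }

        /* Guess at an initial size */
        size = entries * 40 + 400;
        buffer = malloc(size);
        offset = ORIG_OFFSET;

        for (i = 0; i < entries; i++) {
            struct cache_entry *ce = active_cache[i];
            if (check_valid_sha1(ce->sha1) < 0)
                exit(1);
            /* not enough space: reallocate */
            if (offset + ce->namelen + 60 > size) {
                size = alloc_nr(offset + ce->namelen + 60);
                buffer = realloc(buffer, size);
            }
            /* format: octal mode, space, file name, NUL, sha1 */
            offset += sprintf(buffer + offset, "%o %s", ce->st_mode, ce->name);
            buffer[offset++] = 0;
            memcpy(buffer + offset, ce->sha1, 20);
            offset += 20;
        }

        /*
         * offset - ORIG_OFFSET: data length; ORIG_OFFSET: data offset.
         * Write the data length at the tail of the reserved space, put
         * "tree " in front of it, then adjust buffer and offset.
         * Overall format:
         * "tree <size>\0<mode> <name>\0<sha1>[<mode> <name>\0<sha1>]..."
         */
        i = prepend_integer(buffer, offset - ORIG_OFFSET, ORIG_OFFSET);
        i -= 5;
        memcpy(buffer+i, "tree ", 5);

        buffer += i;
        offset -= i;

        write_sha1_file(buffer, offset);
        return 0;
    }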
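The per-entry encoding in the loop above mixes text and binary: an octal mode and a name as text, a NUL, then 20 raw hash bytes. A bounds-checked sketch of appending one such record is shown below; `tree_append` is a hypothetical helper, and returning 0 for "does not fit" is a convention chosen for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Append one record, "<octal mode> <name>\0" followed by 20 raw sha1 bytes,
 * at buf + offset. Returns the new offset, or 0 if it would not fit.
 */
static size_t tree_append(unsigned char *buf, size_t buflen, size_t offset,
                          unsigned int mode, const char *name,
                          const unsigned char sha1[20])
{
    int n;

    assert(offset <= buflen);
    n = snprintf((char *)buf + offset, buflen - offset, "%o %s", mode, name);
    if (n < 0 || (size_t)n + 1 + 20 > buflen - offset)
        return 0;
    offset += (size_t)n + 1;            /* keep the NUL written by snprintf */
    memcpy(buf + offset, sha1, 20);
    return offset + 20;
}
```

A regular file with mode 0100644 named cache.h takes 14 bytes of text, 1 NUL, and 20 hash bytes, 35 bytes in total.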
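prepend_integer writes val as a decimal string backwards, starting just before position i, and returns the new i.
    static int prepend_integer(char *buffer, unsigned val, int i)
    {
        buffer[--i] = '\0';
        do {
            buffer[--i] = '0' + (val % 10);
            val /= 10;
        } while (val);
        return i;
    }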
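The reason for writing backwards is that the header length depends on the payload size, which is only known at the end. Reserving space in front and filling it from right to left avoids moving the payload. A compact sketch of the whole step, under the assumption that the payload already sits at a fixed reserved offset (`RESERVED` and `prepend_header` are hypothetical names), could look like this:

```c
#include <assert.h>
#include <string.h>

#define RESERVED 40   /* space kept free in front of the payload, like ORIG_OFFSET */

/*
 * buf[RESERVED .. RESERVED+len) already holds the payload.
 * Writes "<tag> <len>\0" immediately before it and returns the index
 * where the finished object starts.
 */
static int prepend_header(char *buf, const char *tag, unsigned long len)
{
    int i = RESERVED;
    size_t taglen = strlen(tag);

    assert(taglen + 1 + 20 + 1 <= RESERVED);  /* tag, space, up to 20 digits, NUL */
    buf[--i] = '\0';
    do {
        buf[--i] = (char)('0' + len % 10);
        len /= 10;
    } while (len);
    buf[--i] = ' ';
    i -= (int)taglen;
    memcpy(buf + i, tag, taglen);
    return i;
}
```

With an 11-byte payload and the tag "tree", the header "tree 11" plus its NUL occupies 8 bytes, so the object starts at index 32 and the bytes before it are simply left unused.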
Data sample (lldb memory-read commands):
    x buf
    x/60b buf
Reference on viewing contiguous memory in lldb: https://wenku.baidu.com/view/62a4aea6e63a580216fc700abb68a98271feacb0.html
6. commit-tree.c
Basic logic: after validating the arguments, look up the current user's passwd entry to obtain the user name and email for the changelog. Record the tree sha1, the parent sha1s, the author, the committer, and the comment, then fill in the "commit <size>" header in the space reserved at the front of the buffer. Finally derive the file name from the sha1 of the content and save it to the objects directory.
    int main(int argc, char **argv)
    {
        int i, len;
        int parents = 0;
        unsigned char tree_sha1[20];
        unsigned char parent_sha1[MAXPARENT][20];
        char *gecos, *realgecos;
        char *email, realemail[1000];
        char *date, *realdate;
        char comment[1000];
        struct passwd *pw;
        time_t now;
        char *buffer;
        unsigned int size;

        if (argc < 2 || get_sha1_hex(argv[1], tree_sha1) < 0)
            usage("commit-tree <sha1> [-p <sha1>]* < changelog");

        for (i = 2; i < argc; i += 2) {
            char *a, *b;
            a = argv[i]; b = argv[i+1];
            if (!b || strcmp(a, "-p") || get_sha1_hex(b, parent_sha1[parents]))
                usage("commit-tree <sha1> [-p <sha1>]* < changelog");
            parents++;
        }
        if (!parents)
            fprintf(stderr, "Committing initial tree %s\n", argv[1]);
        /* read the current user's passwd entry, used for the changelog */
        pw = getpwuid(getuid());
        if (!pw)
            usage("You don't exist. Go away!");
        realgecos = pw->pw_gecos;
        len = strlen(pw->pw_name);
        memcpy(realemail, pw->pw_name, len);
        realemail[len] = '@';
        gethostname(realemail+len+1, sizeof(realemail)-len-1);
        time(&now);
        realdate = ctime(&now);

        gecos = getenv("COMMITTER_NAME") ? : realgecos;
        email = getenv("COMMITTER_EMAIL") ? : realemail;
        date = getenv("COMMITTER_DATE") ? : realdate;

        remove_special(gecos); remove_special(realgecos);
        remove_special(email); remove_special(realemail);
        remove_special(date); remove_special(realdate);

        init_buffer(&buffer, &size);
        add_buffer(&buffer, &size, "tree %s\n", sha1_to_hex(tree_sha1));

        /*
         * NOTE! This ordering means that the same exact tree merged with a
         * different order of parents will be a _different_ changeset even
         * if everything else stays the same.
         */
        for (i = 0; i < parents; i++)
            add_buffer(&buffer, &size, "parent %s\n", sha1_to_hex(parent_sha1[i]));

        /* Person/date information */
        add_buffer(&buffer, &size, "author %s <%s> %s\n", gecos, email, date);
        add_buffer(&buffer, &size, "committer %s <%s> %s\n\n", realgecos, realemail, realdate);

        /* And add the comment */
        while (fgets(comment, sizeof(comment), stdin) != NULL)
            add_buffer(&buffer, &size, "%s", comment);

        finish_buffer("commit ", &buffer, &size);

        write_sha1_file(buffer, size);
        return 0;
    }
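Buffer handling: the buffer starts at 16K with 40 bytes reserved for the header, and memory is realloc'ed in 32K steps. The code has a bug, probably a slip of the pen: the 16K and 32K values should be the same. add_buffer rounds the current size up to 32K and assumes that much is already allocated, so writing past 16K before the first realloc overruns the initial 16K malloc and can crash.
    #define BLOCKING (1ul << 14)
    #define ORIG_OFFSET (40)

    /*
     * Leave space at the beginning to insert the tag
     * once we know how big things are.
     *
     * FIXME! Share the code with "write-tree.c"
     */
    static void init_buffer(char **bufp, unsigned int *sizep)
    {
        char *buf = malloc(BLOCKING);
        memset(buf, 0, ORIG_OFFSET);
        *sizep = ORIG_OFFSET;
        *bufp = buf;
    }

    static void add_buffer(char **bufp, unsigned int *sizep, const char *fmt, ...)
    {
        char one_line[2048];
        va_list args;
        int len;
        unsigned long alloc, size, newsize;
        char *buf;

        va_start(args, fmt);
        len = vsnprintf(one_line, sizeof(one_line), fmt, args);
        va_end(args);
        size = *sizep;
        newsize = size + len;
        alloc = (size + 32767) & ~32767;
        buf = *bufp;
        if (newsize > alloc) {
            alloc = (newsize + 32767) & ~32767;
            buf = realloc(buf, alloc);
            *bufp = buf;
        }
        *sizep = newsize;
        memcpy(buf + size, one_line, len);
    }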
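One way to avoid the mismatch is to use a single block constant and to store the allocated size instead of re-deriving it from the used size. The following is a sketch of that idea, not a patch to the original file: `struct grow_buf`, `round_up`, and `grow_append` are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define BLOCK 16384ul   /* one constant for both the initial size and the growth step */

struct grow_buf {
    char *data;
    unsigned long size;    /* bytes in use */
    unsigned long alloc;   /* bytes allocated, always a multiple of BLOCK */
};

static unsigned long round_up(unsigned long n)
{
    return (n + BLOCK - 1) & ~(BLOCK - 1);
}

/* Append len bytes, growing in BLOCK-sized steps. Returns 0 on success, -1 on failure. */
static int grow_append(struct grow_buf *b, const void *src, unsigned long len)
{
    unsigned long newsize = b->size + len;

    assert(b->size <= b->alloc);
    if (newsize > b->alloc) {
        unsigned long alloc = round_up(newsize);
        char *p = realloc(b->data, alloc);
        if (!p)
            return -1;
        b->data = p;
        b->alloc = alloc;
    }
    if (len)
        memcpy(b->data + b->size, src, len);
    b->size = newsize;
    return 0;
}
```

Appending two 10000-byte chunks shows the difference: the second append crosses 16384 bytes, so this version grows to 32768 first, whereas the original would compute an assumed allocation of 32768, skip the realloc, and write past the end of its 16384-byte buffer.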
V. Summary
Clever design, clean and tidy code, a focus on performance, and free-form comments.
1. Basic model
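Git has two basic concepts: The Object Database and the Current Directory Cache.
The Object Database:
Object contents are compressed with zlib and objects are named by their sha1. There are three kinds of objects: BLOB (plain file contents), TREE (a set of file mode, name, and sha1 records, representing the contents of one commit), and CHANGESET (a parent/child chain of TREEs, representing the change history).
Current Directory Cache:
The git staging area: the metadata of the currently cached files.
2. Functional view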
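The first version of git keeps the Linux tool-chain style: each tool does one thing, and the low-level tools combine to provide source management.
1) update-cache: the prototype of git add; saves the latest file contents into objects and updates the local directory cache
2) show-diff: the prototype of git status; compares the cached files against their latest state
3) write-tree: the prototype of git commit, part 1; saves the latest cached tree of the working area into the objects directory and generates its sha1
4) commit-tree: the prototype of git commit, part 2; saves the sha1 of the committed tree and the sha1 of its parents into the objects directory and generates a sha1
Once you understand how these tools work, it is not hard to imagine the principles behind today's git commands and concepts. Take branches: a branch is essentially just the sha1 of a changeset, and from that sha1 you can trace back to the tree of every commit. To diff two commits, compare the two trees to find the directory differences and the changed files, then use each file's sha1 to locate the file and work out what changed in it. Copying a branch only requires copying a single sha1 value. And so on.
3. Performance view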
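Performance is considered throughout, alongside functionality: the layout of the cache entry header, the design of the binary search return value, the object header information, and file access through mmap to avoid copying data from kernel buffers to user buffers.
For ordinary business code readability matters more than performance, but picking up optimizations that are there for the taking is a basic skill for a programmer.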