Cloud-Native Hive on K8s Environment Deployment
I. Overview
Hive is a data warehouse (DW) built on top of Hadoop. It maps structured data files to database tables and provides SQL-like query capabilities; it is a system for data storage, analysis, and reporting. This article covers deployment only; for the underlying concepts, see my earlier article: 大数据Hadoop之数据仓库Hive.
Hive architecture (figure)

Hive client architecture (figure)
II. Deployment
Since Hive depends on Hadoop, the Hive services are bundled into the Hadoop HA on K8s orchestration here. For more on that setup, see: 【云原生】Hadoop HA on k8s 环境部署.

1) Build the image
Dockerfile:

```dockerfile
FROM myharbor.com/bigdata/centos:7.9.2009

RUN rm -f /etc/localtime && ln -sv /usr/share/zoneinfo/Asia/Shanghai /etc/localtime && echo "Asia/Shanghai" > /etc/timezone
RUN export LANG=zh_CN.UTF-8

# Create the user and group; must match spec.template.spec.containers.securityContext.runAsUser: 9999 in the YAML orchestration
RUN groupadd --system --gid=9999 admin && useradd --system --home-dir /home/admin --uid=9999 --gid=admin admin

# Install sudo
RUN yum -y install sudo ; chmod 640 /etc/sudoers

# Grant admin passwordless sudo
RUN echo "admin ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers

RUN yum -y install net-tools telnet wget

RUN mkdir /opt/apache/

ADD jdk-8u212-linux-x64.tar.gz /opt/apache/
ENV JAVA_HOME=/opt/apache/jdk1.8.0_212
ENV PATH=$JAVA_HOME/bin:$PATH

ENV HADOOP_VERSION=3.3.2
ENV HADOOP_HOME=/opt/apache/hadoop
ENV HADOOP_COMMON_HOME=${HADOOP_HOME} \
    HADOOP_HDFS_HOME=${HADOOP_HOME} \
    HADOOP_MAPRED_HOME=${HADOOP_HOME} \
    HADOOP_YARN_HOME=${HADOOP_HOME} \
    HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop \
    PATH=${PATH}:${HADOOP_HOME}/bin

# Alternative: download at build time instead of ADDing a local tarball
# RUN curl --silent --output /tmp/hadoop.tgz https://ftp-stud.hs-esslingen.de/pub/Mirrors/ftp.apache.org/dist/hadoop/common/hadoop-${HADOOP_VERSION}/hadoop-${HADOOP_VERSION}.tar.gz && tar --directory /opt/apache -xzf /tmp/hadoop.tgz && rm /tmp/hadoop.tgz
ADD hadoop-${HADOOP_VERSION}.tar.gz /opt/apache
RUN ln -s /opt/apache/hadoop-${HADOOP_VERSION} ${HADOOP_HOME}

ENV HIVE_VERSION=3.1.2
ADD hive-${HIVE_VERSION}.tar.gz /opt/apache/
ENV HIVE_HOME=/opt/apache/hive
ENV PATH=$HIVE_HOME/bin:$PATH
RUN ln -s /opt/apache/hive-${HIVE_VERSION} ${HIVE_HOME}

RUN chown -R admin:admin /opt/apache
WORKDIR /opt/apache

# Hdfs ports
EXPOSE 50010 50020 50070 50075 50090 8020 9000
# Mapred ports
EXPOSE 19888
# Yarn ports
EXPOSE 8030 8031 8032 8033 8040 8042 8088
# Other ports
EXPOSE 49707 2122
```
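The Dockerfile ADDs local tarballs instead of downloading them at build time, so the build context must already contain them. A quick checklist, with names matching the ADD and ENV lines above:

```bash
# Files expected next to the Dockerfile before building:
#   jdk-8u212-linux-x64.tar.gz
#   hadoop-3.3.2.tar.gz
#   hive-3.1.2.tar.gz
ls *.tar.gz
```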
Build and push the image:

```bash
docker build -t myharbor.com/bigdata/hadoop-hive:v3.3.2-3.1.2 . --no-cache
# Flag reference:
#   -t          image name and tag
#   .           build context (current directory, containing the Dockerfile)
#   -f          explicit Dockerfile path (optional)
#   --no-cache  build without using the cache

docker push myharbor.com/bigdata/hadoop-hive:v3.3.2-3.1.2
```
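Before pushing, a quick smoke test of the image is cheap insurance; this sketch just prints the bundled versions (image name as built above):

```bash
# Verify Hadoop and Hive are unpacked and on PATH inside the image.
docker run --rm myharbor.com/bigdata/hadoop-hive:v3.3.2-3.1.2 hadoop version
docker run --rm myharbor.com/bigdata/hadoop-hive:v3.3.2-3.1.2 hive --version
```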
2) Add the Metastore service orchestration

1. Configuration

```yaml
# hadoop/templates/hive/hive-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ include "hadoop.fullname" . }}-hive
  labels:
    app.kubernetes.io/name: {{ include "hadoop.name" . }}
    helm.sh/chart: {{ include "hadoop.chart" . }}
    app.kubernetes.io/instance: {{ .Release.Name }}
data:
  hive-site.xml: |
    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    <configuration>
      <!-- HDFS warehouse directory -->
      <property>
        <name>hive.metastore.warehouse.dir</name>
        <value>/user/hive_remote/warehouse</value>
      </property>
      <property>
        <name>hive.metastore.local</name>
        <value>false</value>
      </property>
      <!-- JDBC URL of the backing MySQL database; the hive_metastore database is created automatically, so any name works -->
      <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://mysql-primary-headless.mysql:3306/hive_metastore?createDatabaseIfNotExist=true&amp;useSSL=false&amp;serverTimezone=Asia/Shanghai</value>
      </property>
      <!-- MySQL driver -->
      <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.cj.jdbc.Driver</value>
        <!-- <value>com.mysql.jdbc.Driver</value> -->
      </property>
      <!-- MySQL user -->
      <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>root</value>
      </property>
      <!-- MySQL password -->
      <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>WyfORdvwVm</value>
      </property>
      <!-- Whether to verify the metastore schema -->
      <property>
        <name>hive.metastore.schema.verification</name>
        <value>false</value>
      </property>
      <property>
        <name>system:user.name</name>
        <value>root</value>
        <description>user name</description>
      </property>
      <property>
        <name>hive.metastore.uris</name>
        <value>thrift://{{ include "hadoop.fullname" . }}-hive-metastore.{{ .Release.Namespace }}.svc.cluster.local:9083</value>
      </property>
      <!-- host -->
      <property>
        <name>hive.server2.thrift.bind.host</name>
        <value>0.0.0.0</value>
        <description>Bind host on which to run the HiveServer2 Thrift service.</description>
      </property>
      <!-- HiveServer2 thrift port (the default is 10000) -->
      <property>
        <name>hive.server2.thrift.port</name>
        <value>10000</value>
      </property>
      <property>
        <name>hive.server2.active.passive.ha.enable</name>
        <value>true</value>
      </property>
    </configuration>
```
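To inspect the hive-site.xml that Helm will actually generate, the ConfigMap template can be rendered on its own; a minimal sketch, assuming the chart directory is ./hadoop:

```bash
# Render only the Hive ConfigMap without installing anything.
helm template hadoop-ha ./hadoop -n hadoop-ha -s templates/hive/hive-configmap.yaml
```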
2. Controller

```yaml
# hadoop/templates/hive/hive-metastore-statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: {{ include "hadoop.fullname" . }}-hive-metastore
  annotations:
    checksum/config: {{ include (print $.Template.BasePath "/hadoop-configmap.yaml") . | sha256sum }}
  labels:
    app.kubernetes.io/name: {{ include "hadoop.name" . }}
    helm.sh/chart: {{ include "hadoop.chart" . }}
    app.kubernetes.io/instance: {{ .Release.Name }}
    app.kubernetes.io/component: hive-metastore
spec:
  serviceName: {{ include "hadoop.fullname" . }}-hive-metastore
  selector:
    matchLabels:
      app.kubernetes.io/name: {{ include "hadoop.name" . }}
      app.kubernetes.io/instance: {{ .Release.Name }}
      app.kubernetes.io/component: hive-metastore
  replicas: {{ .Values.hive.metastore.replicas }}
  template:
    metadata:
      labels:
        app.kubernetes.io/name: {{ include "hadoop.name" . }}
        app.kubernetes.io/instance: {{ .Release.Name }}
        app.kubernetes.io/component: hive-metastore
    spec:
      affinity:
        podAntiAffinity:
        {{- if eq .Values.antiAffinity "hard" }}
          requiredDuringSchedulingIgnoredDuringExecution:
          - topologyKey: "kubernetes.io/hostname"
            labelSelector:
              matchLabels:
                app.kubernetes.io/name: {{ include "hadoop.name" . }}
                app.kubernetes.io/instance: {{ .Release.Name }}
                app.kubernetes.io/component: hive-metastore
        {{- else if eq .Values.antiAffinity "soft" }}
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 5
            podAffinityTerm:
              topologyKey: "kubernetes.io/hostname"
              labelSelector:
                matchLabels:
                  app.kubernetes.io/name: {{ include "hadoop.name" . }}
                  app.kubernetes.io/instance: {{ .Release.Name }}
                  app.kubernetes.io/component: hive-metastore
        {{- end }}
      terminationGracePeriodSeconds: 0
      initContainers:
      - name: wait-nn
        image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
        command: ['sh', '-c', 'until curl -m 3 -sI http://{{ include "hadoop.fullname" . }}-hdfs-nn-{{ sub .Values.hdfs.nameNode.replicas 1 }}.{{ include "hadoop.fullname" . }}-hdfs-nn.{{ .Release.Namespace }}.svc.cluster.local:9870 | egrep --silent "HTTP/1.1 200 OK|HTTP/1.1 302 Found"; do echo waiting for nn; sleep 1; done']
      containers:
      - name: hive-metastore
        image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
        imagePullPolicy: {{ .Values.image.pullPolicy | quote }}
        command:
        - "/bin/bash"
        - "/opt/apache/tmp/hadoop-config/bootstrap.sh"
        - "-d"
        resources:
{{ toYaml .Values.hive.metastore.resources | indent 10 }}
        readinessProbe:
          tcpSocket:
            port: 9083
          initialDelaySeconds: 10
          timeoutSeconds: 2
        livenessProbe:
          tcpSocket:
            port: 9083
          initialDelaySeconds: 10
          timeoutSeconds: 2
        volumeMounts:
        - name: hadoop-config
          mountPath: /opt/apache/tmp/hadoop-config
        - name: hive-config
          mountPath: /opt/apache/hive/conf
        securityContext:
          runAsUser: {{ .Values.securityContext.runAsUser }}
          privileged: {{ .Values.securityContext.privileged }}
      volumes:
      - name: hadoop-config
        configMap:
          name: {{ include "hadoop.fullname" . }}
      - name: hive-config
        configMap:
          name: {{ include "hadoop.fullname" . }}-hive
```
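bootstrap.sh is expected to bring up the metastore; if the MySQL schema has never been initialized, it can also be created manually with Hive's schematool. A sketch, assuming the release is named hadoop-ha (the pod name follows from it):

```bash
# One-time metastore schema initialization against the MySQL backend.
kubectl exec -n hadoop-ha -it hadoop-ha-hadoop-hive-metastore-0 -- \
  /opt/apache/hive/bin/schematool -dbType mysql -initSchema
```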
3. Service

```yaml
# hadoop/templates/hive/hive-metastore-svc.yaml
# A headless service to create DNS records
apiVersion: v1
kind: Service
metadata:
  name: {{ include "hadoop.fullname" . }}-hive-metastore
  labels:
    app.kubernetes.io/name: {{ include "hadoop.name" . }}
    helm.sh/chart: {{ include "hadoop.chart" . }}
    app.kubernetes.io/instance: {{ .Release.Name }}
    app.kubernetes.io/component: hive-metastore
spec:
  ports:
  - name: metastore
    port: {{ .Values.service.hive.metastore.port }}
    nodePort: {{ .Values.service.hive.metastore.nodePort }}
  type: {{ .Values.service.hive.metastore.type }}
  selector:
    app.kubernetes.io/name: {{ include "hadoop.name" . }}
    app.kubernetes.io/instance: {{ .Release.Name }}
    app.kubernetes.io/component: hive-metastore
```
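Once the Service is up, the thrift port can be probed from a throwaway pod; a sketch, with the service DNS name assuming a hadoop-ha release in the hadoop-ha namespace:

```bash
# TCP reachability check of the metastore thrift port from inside the cluster.
kubectl run -n hadoop-ha metastore-check --rm -it --image=busybox --restart=Never -- \
  nc -zv hadoop-ha-hadoop-hive-metastore.hadoop-ha.svc.cluster.local 9083
```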
3) Add the HiveServer2 service orchestration

1. Controller

```yaml
# hadoop/templates/hive/hive-hiveserver2-statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: {{ include "hadoop.fullname" . }}-hive-hiveserver2
  annotations:
    checksum/config: {{ include (print $.Template.BasePath "/hadoop-configmap.yaml") . | sha256sum }}
  labels:
    app.kubernetes.io/name: {{ include "hadoop.name" . }}
    helm.sh/chart: {{ include "hadoop.chart" . }}
    app.kubernetes.io/instance: {{ .Release.Name }}
    app.kubernetes.io/component: hive-hiveserver2
spec:
  serviceName: {{ include "hadoop.fullname" . }}-hive-hiveserver2
  selector:
    matchLabels:
      app.kubernetes.io/name: {{ include "hadoop.name" . }}
      app.kubernetes.io/instance: {{ .Release.Name }}
      app.kubernetes.io/component: hive-hiveserver2
  replicas: {{ .Values.hive.hiveserver2.replicas }}
  template:
    metadata:
      labels:
        app.kubernetes.io/name: {{ include "hadoop.name" . }}
        app.kubernetes.io/instance: {{ .Release.Name }}
        app.kubernetes.io/component: hive-hiveserver2
    spec:
      affinity:
        podAntiAffinity:
        {{- if eq .Values.antiAffinity "hard" }}
          requiredDuringSchedulingIgnoredDuringExecution:
          - topologyKey: "kubernetes.io/hostname"
            labelSelector:
              matchLabels:
                app.kubernetes.io/name: {{ include "hadoop.name" . }}
                app.kubernetes.io/instance: {{ .Release.Name }}
                app.kubernetes.io/component: hive-hiveserver2
        {{- else if eq .Values.antiAffinity "soft" }}
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 5
            podAffinityTerm:
              topologyKey: "kubernetes.io/hostname"
              labelSelector:
                matchLabels:
                  app.kubernetes.io/name: {{ include "hadoop.name" . }}
                  app.kubernetes.io/instance: {{ .Release.Name }}
                  app.kubernetes.io/component: hive-hiveserver2
        {{- end }}
      terminationGracePeriodSeconds: 0
      initContainers:
      - name: wait-hive-metastore
        image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
        command: ['sh', '-c', 'until (echo "q") | telnet -e q {{ include "hadoop.fullname" . }}-hive-metastore.{{ .Release.Namespace }}.svc.cluster.local {{ .Values.service.hive.metastore.port }} >/dev/null 2>&1; do echo waiting for hive-metastore; sleep 1; done']
      containers:
      - name: hive-hiveserver2
        image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
        imagePullPolicy: {{ .Values.image.pullPolicy | quote }}
        command:
        - "/bin/bash"
        - "/opt/apache/tmp/hadoop-config/bootstrap.sh"
        - "-d"
        resources:
{{ toYaml .Values.hive.hiveserver2.resources | indent 10 }}
        readinessProbe:
          tcpSocket:
            port: 10000
          initialDelaySeconds: 10
          timeoutSeconds: 2
        livenessProbe:
          tcpSocket:
            port: 10000
          initialDelaySeconds: 10
          timeoutSeconds: 2
        volumeMounts:
        - name: hadoop-config
          mountPath: /opt/apache/tmp/hadoop-config
        - name: hive-config
          mountPath: /opt/apache/hive/conf
        securityContext:
          runAsUser: {{ .Values.securityContext.runAsUser }}
          privileged: {{ .Values.securityContext.privileged }}
      volumes:
      - name: hadoop-config
        configMap:
          name: {{ include "hadoop.fullname" . }}
      - name: hive-config
        configMap:
          name: {{ include "hadoop.fullname" . }}-hive
```
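After applying the StatefulSet, following the HiveServer2 log is the quickest way to confirm it came up and reached the metastore; the pod name below is an example derived from the release name:

```bash
# Follow HiveServer2 startup logs.
kubectl logs -n hadoop-ha -f hadoop-ha-hadoop-hive-hiveserver2-0
```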
2. Service

```yaml
# hadoop/templates/hive/hive-hiveserver2-svc.yaml
# A headless service to create DNS records
apiVersion: v1
kind: Service
metadata:
  name: {{ include "hadoop.fullname" . }}-hive-hiveserver2
  labels:
    app.kubernetes.io/name: {{ include "hadoop.name" . }}
    helm.sh/chart: {{ include "hadoop.chart" . }}
    app.kubernetes.io/instance: {{ .Release.Name }}
    app.kubernetes.io/component: hive-hiveserver2
spec:
  ports:
  - name: hiveserver2
    port: {{ .Values.service.hive.hiveserver2.port }}
    nodePort: {{ .Values.service.hive.hiveserver2.nodePort }}
  type: {{ .Values.service.hive.hiveserver2.type }}
  selector:
    app.kubernetes.io/name: {{ include "hadoop.name" . }}
    app.kubernetes.io/instance: {{ .Release.Name }}
    app.kubernetes.io/component: hive-hiveserver2
```
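Besides the NodePort (30000 in values.yaml below), HiveServer2 can also be reached through a port-forward; a sketch that assumes a local beeline client and the service name used above:

```bash
# Forward local port 10000 to the HiveServer2 service, then connect.
kubectl port-forward -n hadoop-ha svc/hadoop-ha-hadoop-hive-hiveserver2 10000:10000 &
beeline -u jdbc:hive2://localhost:10000 -n admin
```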
4) Modify values.yaml

```yaml
# hadoop/values.yaml
image:
  repository: myharbor.com/bigdata/hadoop-hive
  tag: v3.3.2-3.1.2
  pullPolicy: IfNotPresent

# The version of the hadoop libraries being used in the image.
hadoopVersion: 3.3.2
logLevel: INFO

# Select antiAffinity as either hard or soft, default is soft
antiAffinity: "soft"

hdfs:
  nameNode:
    replicas: 2
    pdbMinAvailable: 1
    resources:
      requests:
        memory: "256Mi"
        cpu: "10m"
      limits:
        memory: "2048Mi"
        cpu: "1000m"

  dataNode:
    # Will be used as dfs.datanode.hostname
    # You still need to set up services + ingress for every DN
    # Datanodes will expect to
    externalHostname: example.com
    externalDataPortRangeStart: 9866
    externalHTTPPortRangeStart: 9864

    replicas: 3
    pdbMinAvailable: 1
    resources:
      requests:
        memory: "256Mi"
        cpu: "10m"
      limits:
        memory: "2048Mi"
        cpu: "1000m"

  webhdfs:
    enabled: true

  jounralNode:
    replicas: 3
    pdbMinAvailable: 1
    resources:
      requests:
        memory: "256Mi"
        cpu: "10m"
      limits:
        memory: "2048Mi"
        cpu: "1000m"

hive:
  metastore:
    replicas: 1
    pdbMinAvailable: 1
    resources:
      requests:
        memory: "256Mi"
        cpu: "10m"
      limits:
        memory: "2048Mi"
        cpu: "1000m"

  hiveserver2:
    replicas: 1
    pdbMinAvailable: 1
    resources:
      requests:
        memory: "256Mi"
        cpu: "10m"
      limits:
        memory: "1024Mi"
        cpu: "500m"

yarn:
  resourceManager:
    pdbMinAvailable: 1
    replicas: 2
    resources:
      requests:
        memory: "256Mi"
        cpu: "10m"
      limits:
        memory: "2048Mi"
        cpu: "2000m"

  nodeManager:
    pdbMinAvailable: 1

    # The number of YARN NodeManager instances.
    replicas: 1

    # Create statefulsets in parallel (K8S 1.7+)
    parallelCreate: false

    # CPU and memory resources allocated to each node manager pod.
    # This should be tuned to fit your workload.
    resources:
      requests:
        memory: "256Mi"
        cpu: "500m"
      limits:
        memory: "2048Mi"
        cpu: "1000m"

persistence:
  nameNode:
    enabled: true
    storageClass: "hadoop-ha-nn-local-storage"
    accessMode: ReadWriteOnce
    size: 1Gi
    local:
    - name: hadoop-ha-nn-0
      host: "local-168-182-110"
      path: "/opt/bigdata/servers/hadoop-ha/nn/data/data1"
    - name: hadoop-ha-nn-1
      host: "local-168-182-111"
      path: "/opt/bigdata/servers/hadoop-ha/nn/data/data1"

  dataNode:
    enabled: true
    enabledStorageClass: false
    storageClass: "hadoop-ha-dn-local-storage"
    accessMode: ReadWriteOnce
    size: 1Gi
    local:
    - name: hadoop-ha-dn-0
      host: "local-168-182-110"
      path: "/opt/bigdata/servers/hadoop-ha/dn/data/data1"
    - name: hadoop-ha-dn-1
      host: "local-168-182-110"
      path: "/opt/bigdata/servers/hadoop-ha/dn/data/data2"
    - name: hadoop-ha-dn-2
      host: "local-168-182-110"
      path: "/opt/bigdata/servers/hadoop-ha/dn/data/data3"
    - name: hadoop-ha-dn-3
      host: "local-168-182-111"
      path: "/opt/bigdata/servers/hadoop-ha/dn/data/data1"
    - name: hadoop-ha-dn-4
      host: "local-168-182-111"
      path: "/opt/bigdata/servers/hadoop-ha/dn/data/data2"
    - name: hadoop-ha-dn-5
      host: "local-168-182-111"
      path: "/opt/bigdata/servers/hadoop-ha/dn/data/data3"
    - name: hadoop-ha-dn-6
      host: "local-168-182-112"
      path: "/opt/bigdata/servers/hadoop-ha/dn/data/data1"
    - name: hadoop-ha-dn-7
      host: "local-168-182-112"
      path: "/opt/bigdata/servers/hadoop-ha/dn/data/data2"
    - name: hadoop-ha-dn-8
      host: "local-168-182-112"
      path: "/opt/bigdata/servers/hadoop-ha/dn/data/data3"
    volumes:
    - name: dfs1
      mountPath: /opt/apache/hdfs/datanode1
      hostPath: /opt/bigdata/servers/hadoop-ha/dn/data/data1
    - name: dfs2
      mountPath: /opt/apache/hdfs/datanode2
      hostPath: /opt/bigdata/servers/hadoop-ha/dn/data/data2
    - name: dfs3
      mountPath: /opt/apache/hdfs/datanode3
      hostPath: /opt/bigdata/servers/hadoop-ha/dn/data/data3

  journalNode:
    enabled: true
    storageClass: "hadoop-ha-jn-local-storage"
    accessMode: ReadWriteOnce
    size: 1Gi
    local:
    - name: hadoop-ha-jn-0
      host: "local-168-182-110"
      path: "/opt/bigdata/servers/hadoop-ha/jn/data/data1"
    - name: hadoop-ha-jn-1
      host: "local-168-182-111"
      path: "/opt/bigdata/servers/hadoop-ha/jn/data/data1"
    - name: hadoop-ha-jn-2
      host: "local-168-182-112"
      path: "/opt/bigdata/servers/hadoop-ha/jn/data/data1"
    volumes:
    - name: jn
      mountPath: /opt/apache/hdfs/journalnode

service:
  nameNode:
    type: NodePort
    ports:
      dfs: 9000
      webhdfs: 9870
    nodePorts:
      dfs: 30900
      webhdfs: 30870
  nameNode1:
    type: NodePort
    ports:
      webhdfs: 9870
    nodePorts:
      webhdfs: 31870
  nameNode2:
    type: NodePort
    ports:
      webhdfs: 9870
    nodePorts:
      webhdfs: 31871
  dataNode:
    type: NodePort
    ports:
      webhdfs: 9864
    nodePorts:
      webhdfs: 30864
  resourceManager:
    type: NodePort
    ports:
      web: 8088
    nodePorts:
      web: 30088
  resourceManager1:
    type: NodePort
    ports:
      web: 8088
    nodePorts:
      web: 31088
  resourceManager2:
    type: NodePort
    ports:
      web: 8088
    nodePorts:
      web: 31089
  journalNode:
    type: ClusterIP
    ports:
      jn: 8485
    nodePorts:
      jn: ""
  hive:
    metastore:
      type: NodePort
      port: 9083
      nodePort: 31183
    hiveserver2:
      type: NodePort
      port: 10000
      nodePort: 30000

securityContext:
  runAsUser: 9999
  privileged: true
```

5) Deploy

```bash
# First-time install
helm install hadoop-ha ./hadoop -n hadoop-ha --create-namespace

# Apply changes to an existing release
helm upgrade hadoop-ha ./hadoop -n hadoop-ha
```

A successful deploy ends with the chart's NOTES:
```
NAME: hadoop-ha
LAST DEPLOYED: Thu Sep 29 23:42:02 2022
NAMESPACE: hadoop-ha
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
1. You can check the status of HDFS by running this command:

   kubectl exec -n hadoop-ha -it hadoop-ha-hadoop-hdfs-nn-0 -- /opt/hadoop/bin/hdfs dfsadmin -report

2. You can list the yarn nodes by running this command:

   kubectl exec -n hadoop-ha -it hadoop-ha-hadoop-yarn-rm-0 -- /opt/hadoop/bin/yarn node -list

3. Create a port-forward to the yarn resource manager UI:

   kubectl port-forward -n hadoop-ha hadoop-ha-hadoop-yarn-rm-0 8088:8088

   Then open the ui in your browser:

   open http://localhost:8088

4. You can run included hadoop tests like this:

   kubectl exec -n hadoop-ha -it hadoop-ha-hadoop-yarn-nm-0 -- /opt/hadoop/bin/hadoop jar /opt/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-3.3.2-tests.jar TestDFSIO -write -nrFiles 5 -fileSize 128MB -resFile /tmp/TestDFSIO_write.txt

5. You can list the mapreduce jobs like this:

   kubectl exec -n hadoop-ha -it hadoop-ha-hadoop-yarn-rm-0 -- /opt/hadoop/bin/mapred job -list

6. This chart can also be used with the zeppelin chart

   helm install --namespace hadoop-ha --set hadoop.useConfigMap=true,hadoop.configMapName=hadoop-ha-hadoop stable/zeppelin

7. You can scale the number of yarn nodes like this:

   helm upgrade hadoop-ha --set yarn.nodeManager.replicas=4 stable/hadoop

   Make sure to update the values.yaml if you want to make this permanent.
```
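Before moving on to testing, wait until every pod reports Ready:

```bash
# Watch the rollout; the metastore and hiveserver2 pods should reach Running/Ready.
kubectl get pods -n hadoop-ha -w
```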
6) Test and verify
Check the pods and services:

```bash
kubectl get pods,svc -n hadoop-ha -o wide
```
Test:

```bash
beeline -u jdbc:hive2://localhost:10000 -n admin
```

```sql
create database test;

CREATE TABLE IF NOT EXISTS test.person_1 (
  id INT COMMENT 'ID',
  name STRING COMMENT '名字',
  age INT COMMENT '年龄',
  likes ARRAY<STRING> COMMENT '爱好',
  address MAP<STRING,STRING> COMMENT '地址'
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
COLLECTION ITEMS TERMINATED BY '-'
MAP KEYS TERMINATED BY ':'
LINES TERMINATED BY '\n';
```
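To exercise the table end to end, one row matching the declared delimiters (`,` for fields, `-` for collection items, `:` for map keys) can be loaded and read back. A sketch, run inside the HiveServer2 pod so the LOCAL path resolves there; file path and values are illustrative:

```bash
# Write one delimited row, load it through HiveServer2, and query it back.
printf '1,zhangsan,18,basketball-reading,city:beijing\n' > /tmp/person.txt
beeline -u jdbc:hive2://localhost:10000 -n admin \
  -e "LOAD DATA LOCAL INPATH '/tmp/person.txt' INTO TABLE test.person_1; SELECT * FROM test.person_1;"
```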
7) Uninstall

```bash
helm uninstall hadoop-ha -n hadoop-ha

kubectl delete pod -n hadoop-ha $(kubectl get pod -n hadoop-ha | awk 'NR>1{print $1}') --force
kubectl patch ns hadoop-ha -p '{"metadata":{"finalizers":null}}'
kubectl delete ns hadoop-ha --force

# Clean up the hostPath data directories (run on each node)
rm -fr /opt/bigdata/servers/hadoop-ha/{nn,dn,jn}/data/data{1..3}
```
Git repository: https://gitee.com/hadoop-bigdata/hadoop-ha-on-k8s
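To grab the full orchestration:

```bash
git clone https://gitee.com/hadoop-bigdata/hadoop-ha-on-k8s.git
```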
Only the Hive-related parts of the orchestration are shown here; if anything is unclear, feel free to leave me a message. The corresponding changes have also been pushed to the Git repository above for anyone who needs them. That wraps up the Hive orchestration and deployment; more cloud-native big data tutorials are on the way, so stay tuned.