Integrating Fanfou data into WordPress

From http://tunps.com/fanfou-integrate-in-wordpress

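Fanfou went online in May 2007 and was shut down on 2009-07-08, for reasons everyone knows. Ever since, a large number of Fanfou fans have been waiting for it to come back. Fanfou was the first microblogging service I used, and I am one of those waiting. On 2010-03-01 Fanfou released a data export feature.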

I recently made a decision: merge all of my microblog messages (Fanfou, Digu, Twitter) into this WordPress blog, so that blog and microblog become one.

I have used, in order, Fanfou (2007-11-09 to 2009-07-06), Digu (2009-04-16 to 2010-03-23), and Twitter (2008-10-08 to the present).

Fanfou goes into WordPress first.

The Fanfou data comes from the files in the exported backup:

album.html //photos
favorite_1.html  //favorites
privatemsg_receive_1.html //private messages, inbox
privatemsg_sent_1.html //private messages, sent
status_1.html //status messages

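I hardly used photos on Fanfou, and private messages should not be made public, so I only import the status messages. I don't know why every file name ends in "_1"; my guess is that each file holds up to 1000 messages, and I never posted more than 1000.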

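Looking at status_1.html, I found that it loads the YUI 2 JavaScript framework, but after an afternoon of fiddling I could not get comfortable with YUI 2, so I fell back on jQuery. The complete code: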

$(document).ready(function(){
        // Month abbreviation -> 0-based index, which is what Date expects.
        var MONTHS = {Jan:0, Feb:1, Mar:2, Apr:3, May:4, Jun:5, Jul:6, Aug:7, Sep:8, Oct:9, Nov:10, Dec:11};
        // Left-pad a number to two digits.
        function pad(n){ return (n < 10 ? '0' : '') + n; }
        // Format the UTC fields of a Date as 'YYYY-MM-DD HH:MM:SS'.
        function fmt(d){
            return d.getUTCFullYear() + '-' + pad(d.getUTCMonth()+1) + '-' + pad(d.getUTCDate()) + ' ' +
                   pad(d.getUTCHours()) + ':' + pad(d.getUTCMinutes()) + ':' + pad(d.getUTCSeconds());
        }
        // Escape backslashes and single quotes for a MySQL string literal.
        function esc(s){ return s.replace(/\\/g,"\\\\").replace(/'/g,"\\'"); }
        // Query the page once instead of on every iteration.
        var contents = $(".content");
        var stamps = $(".stamp");
        var sql = [];
        $("ol li").each(function(i){
            var content = contents[i].innerHTML;
            if(content.substring(0,1) == '@') return; // skip @-replies to other people
            // stime is GMT; split on spaces: [1] month name, [2] day, [3] HH:MM:SS, [5] year
            var date_arr = stamps[i].childNodes[0].getAttribute("stime").split(" ");
            var time_arr = date_arr[3].split(":");
            var gmtMs = Date.UTC(parseInt(date_arr[5],10), MONTHS[date_arr[1]], parseInt(date_arr[2],10),
                                 parseInt(time_arr[0],10), parseInt(time_arr[1],10), parseInt(time_arr[2],10));
            var newGmtTimeStr = fmt(new Date(gmtMs));            // post_date_gmt
            var localTime = fmt(new Date(gmtMs + 8*60*60*1000)); // post_date, UTC+8
            var post_name = gmtMs;                                // epoch milliseconds, used as a unique slug
            var title = esc(content.substring(0,60));             // cut before escaping so no escape sequence is split
            sql.push("INSERT INTO `tunps`.`wp_posts` (`ID`, `post_author`, `post_date`, `post_date_gmt`, `post_content`, `post_title`, `post_category`, `post_excerpt`, `post_status`, `comment_status`, `ping_status`, `post_password`, `post_name`, `to_ping`, `pinged`, `post_modified`, `post_modified_gmt`, `post_content_filtered`, `post_parent`, `guid`, `menu_order`, `post_type`, `post_mime_type`, `comment_count`) VALUES (NULL, '1', '"+localTime+"', '"+newGmtTimeStr+"', '"+esc(content)+"', '"+title+"', '0', '', 'publish', 'open', 'open', '', '"+post_name+"', '', '', '0000-00-00 00:00:00', '0000-00-00 00:00:00', '', '0', '', '0', 'post', '', '0');");
        });
        // Write all statements in one go, one per line.
        $("#sql").text( $("#sql").text() + sql.join('\n') + '\n');
});

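The JavaScript above is best run in Chrome, because it is resource hungry: when I ran it, Chrome worked for about 5 seconds at roughly 50% CPU and over 100 MB of memory. JavaScript's Date object nearly made my head spin (its month argument is 0-based, so 0 means January). The CPU and memory cost comes from how the code is written; it solves the problem, so I did not bother optimizing it.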
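The date handling is the part most likely to go wrong, so here is a minimal sketch of it in isolation. It assumes stime has the shape that the script's field indexes imply (weekday, month name, day, time, offset, year, separated by spaces, in GMT); the sample value and the names toPostDates and formatUtc are made up for illustration. All arithmetic is done in UTC so the result does not depend on the browser's time zone.

```javascript
// Hypothetical helper: turn an stime string into the two dates wp_posts needs.
// Assumed input shape: "Sat Jan 31 20:00:00 +0000 2009" (GMT).
var MONTHS = {Jan: 0, Feb: 1, Mar: 2, Apr: 3, May: 4, Jun: 5,
              Jul: 6, Aug: 7, Sep: 8, Oct: 9, Nov: 10, Dec: 11};

// Left-pad a number to two digits.
function pad(n) {
    return (n < 10 ? '0' : '') + n;
}

// Format the UTC fields of a Date as 'YYYY-MM-DD HH:MM:SS'.
function formatUtc(d) {
    return d.getUTCFullYear() + '-' + pad(d.getUTCMonth() + 1) + '-' + pad(d.getUTCDate()) +
        ' ' + pad(d.getUTCHours()) + ':' + pad(d.getUTCMinutes()) + ':' + pad(d.getUTCSeconds());
}

function toPostDates(stime) {
    var p = stime.split(' ');
    var t = p[3].split(':');
    // Date.UTC takes a 0-based month, hence the lookup table.
    var gmtMs = Date.UTC(parseInt(p[5], 10), MONTHS[p[1]], parseInt(p[2], 10),
        parseInt(t[0], 10), parseInt(t[1], 10), parseInt(t[2], 10));
    return {
        gmt: formatUtc(new Date(gmtMs)),                        // post_date_gmt
        local: formatUtc(new Date(gmtMs + 8 * 60 * 60 * 1000))  // post_date, UTC+8
    };
}

console.log(toPostDates('Sat Jan 31 20:00:00 +0000 2009'));
```

For a message sent at 20:00 GMT on January 31, the local date rolls over to February 1, which is exactly the kind of case that breaks when a 1-based month is handed to Date.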
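The jQuery script writes the generated SQL into the page's #sql element (an element with that id has to exist in the page), one INSERT statement per message.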

Copy the SQL into phpMyAdmin and run it, and the messages are imported into WordPress. Note that WordPress orders posts by wp_posts.post_date by default, not by post_date_gmt.
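Because the statements are pasted in as plain text, every message ends up inside a SQL string literal, so quotes and backslashes in the content matter. A minimal sketch of the escaping, assuming the MySQL default where backslash acts as the escape character; sqlQuote is a hypothetical name and the sample content is made up.

```javascript
// Hypothetical helper: wrap a string as a MySQL string literal.
// Assumes the server treats backslash as an escape character (the MySQL default).
function sqlQuote(s) {
    // Backslashes first, otherwise the ones added for the quotes would be doubled.
    return "'" + s.replace(/\\/g, '\\\\').replace(/'/g, "\\'") + "'";
}

var content = "it's a <a href=\"#\">link</a> \\o/";
// Cut the title from the raw text, then escape, so no escape sequence is split.
var title = content.substring(0, 60);

console.log(sqlQuote(content));
console.log(sqlQuote(title));
```

Escaping only the single quote would leave a message that ends in a backslash able to swallow the closing quote of the literal and break the whole script.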
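If you want all the Fanfou data in a category of its own named "碎碎念" (roughly "ramblings"), create a category with that name in the WordPress admin. This adds one row each to wp_terms and wp_term_taxonomy. Write down the wp_term_taxonomy.term_taxonomy_id (mine is 770) and the first and last IDs of the Fanfou rows inserted into wp_posts (mine are 1736 to 2454), then write a PHP file: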

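<?php
// mysqli is used here because the mysql_* functions were removed in PHP 7.
$db = new mysqli("localhost", "username", "password", "database name");
if ($db->connect_error) {
    die("Could not connect: " . $db->connect_error);
}
// Link every imported post (IDs 1736 to 2454) to term_taxonomy_id 770.
// Columns: object_id, term_taxonomy_id, term_order.
for ($i = 1736; $i <= 2454; $i++) {
    $db->query("INSERT INTO wp_term_relationships VALUES ($i, 770, 0)");
}
$db->close();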

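Upload the file to the site and run it; this files all the Fanfou posts under the "碎碎念" category.
Finally, manually set wp_term_taxonomy.count for that category to the number of Fanfou messages.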
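As an alternative to uploading a PHP file, the same rows and the count update could be generated as SQL and pasted into phpMyAdmin like the posts were. This is only a sketch: buildCategorySql is a hypothetical name, and it assumes every ID in the range is a Fanfou post and that the category is new, so its count equals the size of the range.

```javascript
// Hypothetical helper: build the SQL that files posts firstId..lastId under one category.
function buildCategorySql(firstId, lastId, termTaxonomyId) {
    var rows = [];
    for (var id = firstId; id <= lastId; id++) {
        // Columns: object_id, term_taxonomy_id, term_order
        rows.push('(' + id + ', ' + termTaxonomyId + ', 0)');
    }
    var count = lastId - firstId + 1; // both ends are inclusive
    return {
        count: count,
        sql: 'INSERT INTO wp_term_relationships VALUES\n' + rows.join(',\n') + ';\n' +
             'UPDATE wp_term_taxonomy SET count = ' + count +
             ' WHERE term_taxonomy_id = ' + termTaxonomyId + ';'
    };
}

var result = buildCategorySql(1736, 2454, 770);
console.log(result.count); // 719
console.log(result.sql);
```

With the IDs from this post the range holds 2454 - 1736 + 1 = 719 posts, which is the value that goes into wp_term_taxonomy.count.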

About tunpishuang

just 4 fun·····